IDENTIFICATION OF STREPTOCOCCUS PNEUMONIAE SEROTYPES

FIELD OF THE INVENTION 

The present invention relates to molecular methods of typing Streptococcus
pneumoniae, as well as polynucleotides useful in such methods.

BACKGROUND OF THE INVENTION 

Streptococcus pneumoniae is a leading cause of morbidity and mortality, causing
invasive disease such as meningitis and pneumonia as well as more localised disease
such as acute otitis media and sinusitis. Polysaccharide and protein-conjugate
pneumococcal vaccines have the potential to prevent a significant proportion of cases.
Effective protein-conjugate vaccines are particularly important because of the dramatic
increase in prevalence and international dissemination of antibiotic resistant S.
pneumoniae serotypes that commonly cause invasive disease in children (Hausdorff et
al., 2001; Huebner et al., 2000). However, these vaccines protect against only the
relatively small minority (Dunne et al., 2001; Hausdorff et al., 2001) of pneumococcal
serotypes that most commonly cause disease. There is theoretical and limited empirical
evidence that widespread use of these vaccines could lead to substitution of "vaccine"
serotypes with other nonvaccine serotypes, against which the vaccines do not provide
protection. Continued surveillance will be critical to monitor vaccine efficacy and
changes in incidence and distribution of colonising and invasive serotypes (Hausdorff
et al., 2001; Rubins et al., 1999). Any increase in disease caused by previously
uncommon nonvaccine serotypes could necessitate a change in vaccine composition
(Lipsitch, 2001).

S. pneumoniae comprises at least 90 serotypes, distinguished by capsular
polysaccharide antigens. The capsular polysaccharide synthesis (cps) gene clusters for
at least 16 pneumococcal serotypes have been sequenced and serotype-specific genes
identified (Jiang et al., 2001; van Selm et al., 2002). The cps gene cluster contains
genes responsible for synthesis of the serotype-specific polysaccharide including,
except in serotype 3, wzy (polysaccharide polymerase gene) and wzx (polysaccharide
flippase gene). At the 5'-end of the cps gene cluster are four relatively conserved open
reading frames: cpsA (wzg)-cpsB (wzh)-cpsC (wzd)-cpsD (wze). Sequence differences
in this region were used to classify 11 S. pneumoniae serotypes into two classes and, in
the region between the 3'-end of cpsA and the 5'-end of cpsB, there were sites of
heterogeneity between and within serotypes (Jiang et al., 2001; Lawrence et al., 2000).
S. pneumoniae is characterised by high frequency recombination within the cps gene
cluster, leading to serotype "switching" among isolates within genetic lineages defined
by relationships between their more conserved housekeeping genes (Coffey et al.,
1998; Jiang et al., 2001).

Pneumococcal serogroup/type identification is currently performed, using large
panels of expensive antisera, by various methods, including the capsular swelling
(Quellung) reaction (the traditional "gold standard"), latex agglutination and
coagglutination (Arai et al., 2001; Lalitha et al., 1999). Cross-reactions between
serotypes and discrepancies between methods can occur and some strains are
nonserotypable (Henrichsen, 1999).

There is a need for further methods which can be used to identify different
Streptococcus pneumoniae serotypes.

SUMMARY OF THE INVENTION 

Through the complex analysis of a large number of polymorphisms which exist
between 71 molecular capsular types (mct) and subtypes (mcst) of Streptococcus
pneumoniae, the present inventors have devised methods which can be used to
distinguish between a significant number of different S. pneumoniae serotypes.

In a first aspect, the present invention provides a method of determining the
serotype of Streptococcus pneumoniae in a sample, the method comprising analysing at
least a portion of the nucleotide sequence between the 3' end of the cpsA gene and the
5' end of the cpsB gene.

In a preferred embodiment, the portion of the nucleotide sequence between the
3' end of the cpsA gene and the 5' end of the cpsB gene which is analysed is any
nucleotide which is polymorphic between at least some of the S. pneumoniae serotypes
referred to in Figure 2.

In a particularly preferred embodiment, the method comprises amplifying at
least a portion of the nucleotide sequence between the 3' end of the cpsA gene and the
5' end of the cpsB gene, and sequencing the amplification product. More preferably,
the entire region of approximately 800 bp as provided in Figure 2 is amplified and
sequenced.

In the case of sequencing to identify the serotype, the sequencing primers are
selected such that they hybridise specifically to a region within or near to a region
within which a polymorphism is present. The primers need not be specific to particular
serotypes since it is the actual sequence information obtained during the sequencing
process which is used to determine the S. pneumoniae serotype. Thus the primers may
hybridise specifically to genomic DNA from all S. pneumoniae serotypes (or at least
those serotypes referred to in Figure 2), or to genomic DNA from some, but not all, S.
pneumoniae serotypes.

When a portion of the nucleotide sequence between the 3' end of the cpsA gene
and the 5' end of the cpsB gene is amplified, it is preferable that the amplification is
performed using primer pairs comprising a sequence selected from the group consisting
of:

1) GGCATT(/C)TATGGAGTTGATTCG(/A)TCCATT(/C)CACACC(/T)TTAG
and
GC(/T)TCAATG(/A)TGG(/A)GCAATG(/T)ACTGGA(/C)GTA(/G)ATTCCCA(/G)CATC,

2) GGCATT(/C)TATGGAGTTGATTCG(/A)TCCATT(/C)CACACC(/T)TTAG
and CCATCAC(/T)ATAGAGGTTAC(/A)TG(/A)TCTGGCATT(/C)GC, and

3) GAAAGTGGG(/A/T)GGG(/A/T)A(/G^
AATTCT(/G)CAAGAT(/C)TTA(/G)AAA(/G)G and
T(/G)CATG(/A)CTA(/G)A^
T).
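
The primer notation used above can be read mechanically. The following is a minimal sketch, assuming that a base followed by a bracketed list, such as "T(/C)", denotes a position at which either T or C may be present. The function name and the constant REVERSE_PRIMER_2 are illustrative only and form no part of the described method; the constant holds the second primer of pair 2 as listed above.

```python
import re

# Second primer of pair 2 as written above, in the degenerate notation.
REVERSE_PRIMER_2 = "CCATCAC(/T)ATAGAGGTTAC(/A)TG(/A)TCTGGCATT(/C)GC"


def degenerate_to_regex(primer: str) -> str:
    """Convert notation such as 'ATT(/C)G' into the regex 'AT[CT]G'.

    Assumption: a bracketed list applies to the base immediately before it.
    """
    positions = []  # one set of permitted bases per primer position
    i = 0
    while i < len(primer):
        ch = primer[i]
        if ch == "(":
            close = primer.index(")", i)
            alternatives = [b for b in primer[i + 1:close].split("/") if b]
            if not positions:
                raise ValueError("bracketed list with no preceding base")
            positions[-1].update(alternatives)
            i = close + 1
        elif ch in "ACGT":
            positions.append({ch})
            i += 1
        elif ch.isspace():
            i += 1  # tolerate stray whitespace
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return "".join(
        next(iter(p)) if len(p) == 1 else "[" + "".join(sorted(p)) + "]"
        for p in positions
    )


if __name__ == "__main__":
    pattern = degenerate_to_regex(REVERSE_PRIMER_2)
    print(pattern)
    # One concrete variant of the primer, taking the first-listed base each time.
    print(bool(re.fullmatch(pattern, "CCATCACATAGAGGTTACTGTCTGGCATTGC")))
```

A regular expression of this kind could be used to check which concrete oligonucleotides a degenerate primer covers. Truncated sequences, such as those of primer pair 3 above, cannot be converted and are rejected by the sketch.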

In an alternate embodiment, the nucleotide sequence analysis step comprises
determining whether a polynucleotide obtained from S. pneumoniae selectively
hybridises to a polynucleotide probe comprising one or more polymorphic regions of
the nucleotide sequence between the 3' end of the cpsA gene and the 5' end of the cpsB
gene, wherein such polymorphic regions are shown in Figure 2. More preferably, the
nucleotide sequence analysis step uses a plurality of said polynucleotide probes.
In a particularly preferred embodiment, where hybridisation to a plurality of probes is
used as a means of analysis, the plurality of polynucleotide probes are present as a
microarray.

It has been noted that the method of the first aspect does not enable the
identification of all known S. pneumoniae serotypes; for example, shared sequences
were noted in the following cases: 6A with 6B, 10A and 23A with 23F, 15B with 22F,
and 17F with 35B. Accordingly, in these instances further analysis will need to be
performed to determine the correct serotype. To this end, the present inventors have
discovered that polymorphisms in the wzy and/or wzx genes can also be useful for S.
pneumoniae serotyping.

Accordingly, in a second aspect the present invention provides a method of
determining the serotype of Streptococcus pneumoniae in a sample, the method
comprising analysing at least a portion of the wzy and/or wzx gene(s).

In a preferred embodiment, the method of the second aspect comprises
amplifying at least a portion of the wzy and/or wzx gene(s), and determining the length
of the amplification product.

In a particularly preferred embodiment, at least a portion of the wzy and/or wzx
gene(s) is amplified using primer pairs comprising a sequence selected from the group
consisting of:

1) GTAGGTGTAGTTTTTTCAGGGACTTTAATTTTATGCAGTG and 
TCGCTTAACACAATGGCTTTAGAAGGTAGAG, 

2) GTTATTTTATTTTTTTTGTCGGCATTGTATTCTTTATATCG and
CAAATTCATCGTTTGTATCCATTTAACTGCATC,

3) CTTATATCTAATTATGTTCCGTCTATATTTATATGGGTTTGCTTTC 
and TTTCTCTTCATTTTCCTGATAATTTTGTACTTCTGAATG, 

4) ATGCTTTTAAATTTCTTATTCATATCTATTTTTC and 
GTAAACAGAGAGCGAGTGATCATTTTAAAACTTTTGG, 

5) G(/A)GATTTT(/G)TTTCAACCT(/C)GCAGTAATTTTAACAA(/C)TC(/T)G(/A) and
CCTGAAAACAA(/G)TACT(/C)ACTTTCTGAATTTCAC(/T)GGA(/G)TATAAAG,

6) GTTTTATTGACTTTAAAGATGTTAGTTTCTTCGATTCCAG and
TTTTTATTACTCTTCTTAAATCATAATGAATCGTACCAATCAAC,

7) GGATCAATGGCAACTATATTTACCCTACTCTCCACAG and
GAGTCGAAACCAACCGGAAAAAGCAATTGAG,

8) CCTTTGGTTTATTATCCTACTTCCAAAACAGTTTATGC and 
CATATATCTCTTTATCCTGTCAATATTGATTGGCATTTTC, 

9) GATATTAGCTATACCAACAATTGTTCTTTTCCTGTACTCAGTC and
GCATTTCTAGTACCGAACCATTGAAACTATCATCTG,

10) GAAATTATAGTCGGAGCTTTCATTTATATTAGTTTACTGGTTCTG
and CAGAATAAAGAGAGCTGTAATAGGTGCAACTTCATGC,

11) CTGTAATGTTTCTAATTAGTTCAGTATTTGCACTGGTTAATTC and 
CCCGTATATCCATTACTAAGAACAAGGTTGTATATTTCCTTC, 

12) GTTTCTCATTAGTTCTGTATTTGCCCTTATTAATGTGC and
CCATGGCTAAGTGCAAGATTATGAATCTCTGTC,

13) GTTTCTTATGTTTACCCTCAGCTTATATTGGCACAG and 
GATACCACAAATCTCCGAATTCTCTTAAAATAGATGG, 

14) TTAAGTAGTTCACAAGTGATAGTGAACTTGGGATTGTC and
CACTGAGATTATTTATTAGCTTTATCGGTAAGGTGGATAAG,

15) ATTACTTGTAATACTATGTATTCAACTAGTCA(/C)AGGATTTGATGG
and GAACAAATTTCCGTATCAGATTTGCGATTTC,

16) CCAATGAAAAGGAAAGTTCAATGTGTTTTGTTTCTGC and
GGTGCTTCAGCAAAAATCCCCGTATTTCTTATCAG,

17) TAGCTGATGTTCCGATAAATTATGGTGGGGTAATAATAG and
CTGCGACACTGTATATACCTACATTATAACTACTAGACATTTGC,

18) GCAACTTTGGTTCTAAAATTTTAGTCTTTTTAATGGTTCC and 
TGTTAAACCCCAATATAGAAATTGTATTGAGAATAGCAGC, 

19) CGTTAATAGCTTATGTTCAACTGGTGATTGATTTTGG and 
TGATAGTTTTAGAAATAATATAAGGAATTGCAACTGCATGC, 

20) TTCATGTC(/T)T(/C)TTTTG(/A)TCTAATCTGATTACAATTG(/C)TC(/T)ACATCG(/A) and
T(/C)GCATTTG(/T)GATCTGTCACAA(/G)TCAATAAGTTAAAACC,

21) GGTAGGTATTTTAATTGGAGGAAGAGAGTCTTGAATGG and
ATCTTCCCTTCATAAATTGACATAGGAAAAATAAGAGCC,

22) CAATTCTAACTATGTCCAGTTTTATTTTTCCACTCATCAG and
GACGTGATAATAATAAGCTGCCATTCCTGTCTAAAACG,

23) CGGCGGTATTAAGTAGAATATTAACACCTGAAGAGTATGGC and
GGCAATCAGACTCAATAAGTTCATCCGTTTAAAGTTC,

24) GGTATTGCCTTTCCTTTGATAACTTCTCCTTATTTATCAC and
TGAACTTGTAACTCGACACCCAAAAATATAAATAAATGAG,

25) GAATCGGACAATAGCACAGGTACGAACAAG and 
GCCATGTAATCAACTGAGCAAGCAGGGTACTC, 

26) CAAAGGAACGTTATCAGCAATTGTGTCAAATTTCAG and 
AAGATTAGGGCGCACAAAGTTTACTTGTTTTAGC, 

27) GTTATTTCTTCAAATCTGCTCATAGTTTTAACCTCATCAC and
TATCTTGCGTTTTCATCCCTTACAGTTATTAGGTTCAAAG,

28) TTCTTCAAATCTTTTGACAGTCTTGACCTCTTCCTTG and
TATCGTGCATTCGAATCTGTTACAGCTAATACATTTAAAC,

29) GTCCTGACGCTATCAAATATCATTTTCCCATTAATCAC and
CCCACATGTGATCAATAGGAGTGAAAATTCTCTATTC,

30) GCTTTGGCTAACTTTTCATCAAAGATTTTAATTTTm and 
CCAGAGATAGCTGTAACACCAATTTTATCAATTCCCTTAG, 

31) CCTTTGGCTAATTTCTTGGACGATAATGAATTTGTATATG and 
CCACAAACATTAGCAATAAAGAAACCTAACAATCCC, 

32) GATCATACTCCCTATCATTACGACTCCCTATGTAACG and
CCAAGAAATATCCAAACCTTTTGACACTAAACTTAATCC, and

33) GTTGTTTTAGCTCAAGGAGGGATAATGTTGGCTTCG and
GCTGATTTTACAAATAGGAAAATAGAGATTGCACCAAC.

Guidance regarding the serotypes these primer pairs target, and the length of 
resulting amplification products, is provided in Tables 2 and 3. 
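
By way of illustration only, the comparison of an observed amplification product length with the length expected for a given primer pair can be sketched as follows. The primer pair labels and lengths in the dictionary are hypothetical placeholders and are not the values given in Tables 2 and 3; the tolerance is likewise an assumption standing in for the sizing error of gel electrophoresis.

```python
# Hypothetical placeholder values: NOT the lengths given in Tables 2 and 3.
EXPECTED_LENGTHS_BP = {
    "pair_A": 450,
    "pair_B": 620,
    "pair_C": 880,
}


def product_matches(pair, observed_bp, tolerance_bp=20):
    """Return True if a product was observed and its estimated length lies
    within tolerance_bp of the length expected for the named primer pair.

    observed_bp is None when no amplification product was obtained.
    """
    if observed_bp is None:
        return False
    return abs(observed_bp - EXPECTED_LENGTHS_BP[pair]) <= tolerance_bp


if __name__ == "__main__":
    print(product_matches("pair_B", 610))   # within tolerance of 620
    print(product_matches("pair_B", None))  # no product obtained
```

In practice the dictionary would be populated from the tables referred to above, and a positive result would still be subject to the non-serotype-specific amplification noted in the following paragraph.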

It has been noted that some of the above primer pairs formed non-serotype
specific amplicons. For example, PCR targeting serotype 6B also amplified 6A; PCR
targeting 18C amplified all serotypes in serogroup 18; PCR targeting wzx (but not wzy)
of serotype 23F amplified three serotype 23A strains; and PCR targeting wzx and wzy of
serotypes 33/37 amplified a 33A isolate, while that targeting wzx amplified a serotype
33B isolate. Accordingly, in these instances further analysis will need to be performed
to determine the correct serotype. For instance, traditional serological typing can be
performed.

As the skilled addressee would be aware, serotype 3 does not contain wzy and
wzx genes. Accordingly, upon obtaining results using the methods of the first aspect,
the presence of serotype 3 can be confirmed by analysing the orf2 (wze)-cap3A-cap3B
region. Preferably, serotype 3 is identified by amplifying a portion of the orf2 (wze)-
cap3A-cap3B region using primer pairs selected from the group consisting of:

1) GCACAAAAAAAAGTTTGATATTCCCCTTGACAATAG and
GCAGGATCTAAGGAGGCTTCAAGATTCAACTC, and

2) CGAACCTACTATTGAGTGTGATACTTTTATGGGATACAGAG and
CTGACAGCATGAAAATATATAACCGCCCAACGAATAAG.

During routine analysis of a sample comprising bacteria it will typically be
desirable to ensure that the sample being analysed actually contains Streptococcus
pneumoniae. Thus, it is preferred that the methods of the present invention include
detecting any serotype of Streptococcus pneumoniae in the sample.

Such methods are known in the art and include, but are not limited to, 
amplifying portions of the psaA and/or pneumolysin genes followed by detection of the 
amplification products. 

In a preferred embodiment, a portion of the psaA gene is amplified using
primers comprising the sequence
TACATTACTCGTTCTCTTTCTTTCTGCAATCATTCTTG and
TAGTAGCTGTCGCCTTCTTTACCTTGTTCTGC. In another preferred
embodiment, a portion of the pneumolysin gene is amplified using primers comprising
the sequence AGAATAATCCCACTCTTCTTGCGGTTGA and
CATGCTGTGAGCCGTTATTTTTTCATACTG.

The present inventors have observed a strong correlation between the molecular
typing techniques of the first and second aspects and the actual serotype of a strain as
determined by traditional antibody based serological typing. However, the typing
methods of the invention may be assisted by further serotyping the S. pneumoniae
strain. For instance, to ensure recombination events have not occurred, upon typing
with the methods of the invention the serotype can be confirmed by serologically
typing for the strain suggested by the methods of the invention. Furthermore, the
inventors have noted that a few serotypes are difficult to resolve using the methods of
the invention. These serotypes include 6A and 6B; 10A, 23F and 23A; 15B and 22F; and
17F and 35B. Upon identification of any of these serotypes by the molecular techniques
of the invention the serotype can be unequivocally typed using traditional serological
methods.
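
The follow-up rule described above can be expressed as a simple lookup. The following is a minimal sketch, assuming the molecular result is available as a serotype label; the function and constant names are illustrative only. The groups are those listed in the preceding paragraph.

```python
# Serotype groups stated above to be difficult to resolve by the molecular methods.
AMBIGUOUS_GROUPS = (
    frozenset({"6A", "6B"}),
    frozenset({"10A", "23F", "23A"}),
    frozenset({"15B", "22F"}),
    frozenset({"17F", "35B"}),
)


def unresolved_alternatives(molecular_call):
    """Return the serotypes that the molecular call cannot be distinguished
    from. An empty set means the call is not in one of the listed groups."""
    for group in AMBIGUOUS_GROUPS:
        if molecular_call in group:
            return set(group) - {molecular_call}
    return set()


if __name__ == "__main__":
    for call in ("6A", "23F", "14"):
        others = unresolved_alternatives(call)
        if others:
            print(call, "-> confirm serologically against", sorted(others))
        else:
            print(call, "-> not in a listed ambiguous group")
```

A call falling outside the listed groups may still be confirmed serologically where a recombination event is suspected, as noted above.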

In a third aspect, the present invention provides a polynucleotide comprising a
sequence selected from those provided in Figures 2 to 64, or a fragment thereof which
is at least 10 nucleotides in length, with the proviso the polynucleotide does not
comprise the 3' end of the cpsA gene to the 5' end of the cpsB gene of a S. pneumoniae
serotype selected from the group consisting of: 1, 2, 3, 4, 6A, 6B, 8, 9V, 14, 18C, 19F,
19A, 19B, 23F, 33F and 37, with the further proviso that the polynucleotide does not
comprise the entire wzy and/or wzx gene(s) of a S. pneumoniae serotype selected from
the group consisting of: 1, 2, 4, 6A, 6B, 8, 9V, 14, 18C, 19F, 19A, 19B, 23F, 33F and
37, or the entire wzx gene of S. pneumoniae serotype 19C.

In a preferred embodiment, the polynucleotide of the third aspect is at least 15
nucleotides, more preferably at least 20 nucleotides, more preferably at least 25
nucleotides, more preferably at least 30 nucleotides, more preferably at least 50
nucleotides in length, and even more preferably at least 100 nucleotides in length.

In a fourth aspect, the present invention provides a polynucleotide consisting
essentially of 10 to 50 contiguous nucleotides corresponding to a portion of the 3' end
of the cpsA S. pneumoniae gene or the 5' end of the cpsB S. pneumoniae gene, wherein
said polynucleotide comprises one or more nucleotides which differ between different
S. pneumoniae serotypes.

Polynucleotides of the fourth aspect can be used as amplification primers, or as 
probes, for the identification of different S. pneumoniae serotypes. 

Preferably the nucleotides which differ between S. pneumoniae serotypes 
correspond to one or more of the positions as shown in Figure 2.

Preferably, the polynucleotide of the fourth aspect is detectably labelled. The
label can be any suitable label known in the art including, but not limited to,
radionuclides, enzymes, fluorescent, and chemiluminescent labels.

In a fifth aspect, the present invention provides a polynucleotide consisting
essentially of 10 to 50 contiguous nucleotides corresponding to a portion of the S.
pneumoniae wzy and/or wzx gene(s), wherein said polynucleotide comprises one or
more nucleotides which differ between different S. pneumoniae serotypes.

In a sixth aspect the present invention provides a composition comprising a
plurality of polynucleotides according to the invention. Preferably, the composition
further comprises a carrier or excipient. Preferably, the carrier or excipient is water or
a suitable buffer. The composition may be used in methods of typing different S.
pneumoniae serotypes.

In a seventh aspect the present invention provides a microarray comprising a
plurality of polynucleotides according to the invention. The microarray may be used in
methods of typing different S. pneumoniae serotypes.

In an eighth aspect, the present invention provides a kit comprising at least one
polynucleotide of the present invention.

Preferably, the polynucleotide is in accordance with the fourth or fifth aspects of
the invention. In one embodiment, the kit further comprises reagents necessary for
nucleic acid amplification. In another embodiment, the polynucleotide of the fourth or
fifth aspect is detectably labelled and the kit further comprises means for detecting the
labelled polynucleotide.

As will be apparent, preferred features and characteristics of one aspect of the
invention are applicable to many other aspects of the invention.

Throughout this specification the word "comprise", or variations such as
"comprises" or "comprising", will be understood to imply the inclusion of a stated
element, integer or step, or group of elements, integers or steps, but not the exclusion of
any other element, integer or step, or group of elements, integers or steps.

The invention is hereinafter described by way of the following non-limiting
examples and with reference to the accompanying figures.

BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS 

Figure 1. The genomic sequence of cpsA (wzg) and cpsB (wzh) genes of serotype 4 of
S. pneumoniae as published by Jiang et al. (2001) and deposited as GenBank Accession
Number AF316639. The remaining 3' sequence of GenBank Accession Number
AF316639 has not been provided. Nucleotides 1520 to 2965 encode cpsA whilst
nucleotides 2967 to 3698 encode cpsB.

Figure 2. Multiple sequence alignments for the region between the 3'-end of cpsA
(wzg) and the 5'-end of cpsB (wzh) of 51 molecular capsular types (mct)/71 molecular
capsular subtypes (mcst) of S. pneumoniae. The alignment numbering start point "1"
refers to the position "2470" of S. pneumoniae serotype 4 cpsA (wzg) gene (GenBank
accession number: AF316639) (Figure 1).

Figure 3: Partial sequence of strain 00-251-3185 of S. pneumoniae wzx gene.

Figure 4: Partial sequence of strain 01-122-0226 of S. pneumoniae wzx gene.

Figure 5: Partial sequence of strain 01-192-2471 of S. pneumoniae wzx gene.

Figure 6: Partial sequence of strain MA055100 of S. pneumoniae wzx gene.

Figure 7: Partial sequence of strain NZSPN01/329 of S. pneumoniae wzx gene.

Figure 8: Partial sequence of strain 00-256-1986 of S. pneumoniae wzx gene.

Figure 9: Partial sequence of strain NZSPN01/276 of S. pneumoniae wzx gene.

Figure 10: Partial sequence of strain 00-201-1422 of S. pneumoniae wzx gene.

Figure 11: Partial sequence of strain 00-211-1669 of S. pneumoniae wzx gene.

Figure 12: Partial sequence of strain 00S002 of S. pneumoniae wzx gene.

Figure 13: Partial sequence of strain 00-251-3185 of S. pneumoniae wzy gene.

Figure 14: Partial sequence of strain 01-122-0226 of S. pneumoniae wzy gene.

Figure 15: Partial sequence of strain 01-192-2471 of S. pneumoniae wzy gene.

Figure 16: Partial sequence of strain MA055100 of S. pneumoniae wzy gene.

Figure 17: Partial sequence of strain NZSPN01/329 of S. pneumoniae wzy gene.

Figure 18: Partial sequence of strain 00-256-1986 of S. pneumoniae wzy gene.

Figure 19: Partial sequence of strain NZSPN01/276 of S. pneumoniae wzy gene.

Figure 20: Partial sequence of strain 00-201-1422 of S. pneumoniae wzy gene.

Figure 21: Partial sequence of strain 00-211-1669 of S. pneumoniae wzy gene.

Figure 22: Partial sequence of strain 00S002 of S. pneumoniae wzy gene.

Figure 23: Partial sequence of strain NZSPN01/509 of S. pneumoniae cpsl and wzx
genes.

Figure 24: Partial sequence of strain MA050408 of S. pneumoniae cpsl and wzx
genes.

Figure 25: Partial sequence of strain MA052433 of S. pneumoniae cpsl and wzx
genes.

Figure 26: Partial sequence of strain 00S009 of S. pneumoniae cpsl and wzx genes.

Figure 27: Partial sequence of strain 99-325-0373 of S. pneumoniae cpsl and wzx
genes.

Figure 28: Partial sequence of strain NZSPN00/454 of S. pneumoniae cpsl and wzx
genes.

Figure 29: Partial sequence of strain NZSPN00/484 of S. pneumoniae cpsl and wzx
genes.

Figure 30: Partial sequence of strain 00-081-2291 of S. pneumoniae wzy and wzx
genes.

Figure 31: Partial sequence of strain 00S168 of S. pneumoniae wzy and wzx genes.

Figure 32: Partial sequence of strain 00-280-1493 of S. pneumoniae wzy and wzx
genes.

Figure 33: Partial sequence of strain MA063073 of S. pneumoniae wzy and wzx
genes.

Figure 34: Partial sequence of strain NZSPN00/410 of S. pneumoniae wzy and wzx
genes.

Figure 35: Partial sequence of strain NZSPN01/243 of S. pneumoniae wzy and wzx
genes.

Figure 36: Partial sequence of strain MA063087 of S. pneumoniae wzy and wzx
genes.

Figure 37: Partial sequence of strain MA063207 of S. pneumoniae wzy and wzx
genes.

Figure 38: Partial sequence of strain 01S333 of S. pneumoniae wzx gene.

Figure 39: Partial sequence of strain MA050663 of S. pneumoniae wciW and wzx
genes.

Figure 40: Partial sequence of strain 01S319 of S. pneumoniae wciW and wzx genes.

Figure 41: Partial sequence of strain NZSPN00/353 of S. pneumoniae wciW and wzx
genes.

Figure 42: Partial sequence of strain MA062610 of S. pneumoniae wciW and wzx
genes.

Figure 43: Partial sequence of strain MA053392 of S. pneumoniae wciW and wzx
genes.

Figure 44: Partial sequence of strain NZSPN00/319 of S. pneumoniae wciW and wzx
genes.

Figure 45: Partial sequence of strain NZSPN01/278 of S. pneumoniae wciW and wzx
genes.

Figure 46: Partial sequence of strain 01S009 of S. pneumoniae wciW and wzx genes.

Figure 47: Partial sequence of strain MA052628 of S. pneumoniae wciW and wzx
genes.

Figure 48: Partial sequence of strain 00-081-2291 of S. pneumoniae cpsJ and wzy
genes.

Figure 49: Partial sequence of strain 00-280-1493 of S. pneumoniae cpsJ and wzy
genes.

Figure 50: Partial sequence of strain NZSPN00/410 of S. pneumoniae cpsJ and wzy
genes.

Figure 51: Partial sequence of strain NZSPN01/243 of S. pneumoniae cpsJ and wzy
genes.

Figure 52: Partial sequence of strain MA063073 of S. pneumoniae cpsJ and wzy
genes.

Figure 53: Partial sequence of strain 00S168 of S. pneumoniae cpsJ and wzy genes.

Figure 54: Partial sequence of strain MA063087 of S. pneumoniae cpsJ and wzy
genes.

Figure 55: Partial sequence of strain MA063207 of S. pneumoniae cpsJ and wzy
genes.

Figure 56: Partial sequence of strain 01S319 of S. pneumoniae wzx and wzy genes.

Figure 57: Partial sequence of strain NZSPN00/353 of S. pneumoniae wzx and wzy
genes.

Figure 58: Partial sequence of strain MA062610 of S. pneumoniae wzx and wzy
genes.

Figure 59: Partial sequence of strain MA053392 of S. pneumoniae wzx and wzy
genes.

Figure 60: Partial sequence of strain NZSPN00/319 of S. pneumoniae wzx and wzy
genes.

Figure 61: Partial sequence of strain NZSPN01/278 of S. pneumoniae wzx and wzy
genes.

Figure 62: Partial sequence of strain MA050663 of S. pneumoniae wzx and wzy
genes.

Figure 63: Partial sequence of strain MA052628 of S. pneumoniae wzx and wzy
genes.

Figure 64: Partial sequence of strain 01S009 of S. pneumoniae wzx and wzy genes.

Figure 65: Phylogenetic tree inferred from sequences in the region between the 3'-end
of cpsA (wzg) and the 5'-end of cpsB (wzh) genes for 51 molecular capsular types
(mct)/71 molecular capsular subtypes (mcst) of S. pneumoniae. Most of the tree input
sequences are from Figure 2 and Table 1; for GenBank accession numbers see Table 1.
Sequences of two nonserotypable isolates were also included; they were clearly
separated from the other known mct/mcst.

DETAILED DESCRIPTION OF THE INVENTION 
Definitions 

Unless defined otherwise, all technical and scientific terms used herein have the
same meaning as commonly understood by one of ordinary skill in the art (e.g., in cell
culture, molecular genetics, nucleic acid chemistry, hybridization techniques and
biochemistry).

As used herein, the term "nucleotide sequence between the 3' end of the cpsA
gene and the 5' end of the cpsB gene" at least refers to the region spanning from
nucleotide 2470 to nucleotide 3268 of Figure 1. Figure 1 provides the genomic
sequence of cpsA (wzg) and cpsB (wzh) genes of serotype 4 as published by Jiang et al.
(2001) and submitted as GenBank Accession Number AF316639. As the skilled
addressee would be aware, the same region from other serotypes of S. pneumoniae can
be identified using standard techniques such as DNA cloning, sequencing and
nucleotide sequence alignment. Such techniques are described in further detail in the
Examples section. In addition, these techniques have been used to determine the
nucleotide sequence between the 3' end of the cpsA gene and the 5' end of the cpsB
gene from many different serotypes of S. pneumoniae, the results of which, including a
consensus sequence for this region, are also provided in Figure 2.
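
The relationship between the alignment numbering of Figure 2 and the nucleotide numbering of Figure 1 can be illustrated with a short calculation. The following is a minimal sketch using only the coordinates stated in this specification (alignment position 1 corresponding to nucleotide 2470; region 2470 to 3268; cpsA at 1520 to 2965; cpsB at 2967 to 3698). It assumes an ungapped serotype 4 sequence, so it would not hold for alignment columns containing gaps; the function names are illustrative only.

```python
ALIGNMENT_START = 2470          # Figure 1 nucleotide matching alignment position 1
REGION_END = 3268               # last Figure 1 nucleotide of the defined region
CPSA = (1520, 2965)             # cpsA coordinates in Figure 1
CPSB = (2967, 3698)             # cpsB coordinates in Figure 1


def alignment_to_figure1(position):
    """Map a 1-based alignment position to Figure 1 numbering (no gaps assumed)."""
    if position < 1:
        raise ValueError("alignment positions are 1-based")
    return ALIGNMENT_START + position - 1


def gene_at(figure1_position):
    """Name the gene containing a Figure 1 nucleotide, if any."""
    if CPSA[0] <= figure1_position <= CPSA[1]:
        return "cpsA"
    if CPSB[0] <= figure1_position <= CPSB[1]:
        return "cpsB"
    return "intergenic"


REGION_LENGTH = REGION_END - ALIGNMENT_START + 1

if __name__ == "__main__":
    print(REGION_LENGTH)  # length of the defined region in nucleotides
    for pos in (1, 496, 497, 498, REGION_LENGTH):
        nt = alignment_to_figure1(pos)
        print(pos, nt, gene_at(nt))
```

On these coordinates the defined region is 799 nucleotides long, consistent with the approximately 800 bp referred to above, and comprises 496 nucleotides of cpsA, one intervening nucleotide and 302 nucleotides of cpsB.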

General Techniques 

Unless otherwise indicated, the recombinant DNA and immunological
techniques utilized in the present invention are standard procedures, well known to
those skilled in the art. Such techniques are described and explained throughout the
literature in sources such as J. Perbal, A Practical Guide to Molecular Cloning, John
Wiley and Sons (1984); J. Sambrook et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory Press (1989); T.A. Brown (editor), Essential
Molecular Biology: A Practical Approach, Volumes 1 and 2, IRL Press (1991); D.M.
Glover and B.D. Hames (editors), DNA Cloning: A Practical Approach, Volumes 1-4,
IRL Press (1995 and 1996); F.M. Ausubel et al. (editors), Current Protocols in
Molecular Biology, Greene Pub. Associates and Wiley-Interscience (1988, including
all updates until present); Ed Harlow and David Lane (editors), Antibodies: A
Laboratory Manual, Cold Spring Harbor Laboratory (1988); and J.E. Coligan et al.
(editors), Current Protocols in Immunology, John Wiley & Sons (including all updates
until present), and are incorporated herein by reference.

Detection of Polymorphisms

Any technique known in the art can be used to detect a polymorphism described
herein. Examples of such techniques include, but are not limited to, sequencing of the
DNA at one or more of the relevant positions; differential hybridisation of an
oligonucleotide probe designed to hybridise at the relevant positions of a particular S.
pneumoniae serotype(s); denaturing gel electrophoresis following digestion with an
appropriate restriction enzyme, preferably following amplification of the relevant DNA
regions; S1 nuclease sequence analysis; non-denaturing gel electrophoresis, preferably
following amplification of the relevant DNA regions; conventional RFLP (restriction
fragment length polymorphism) assays; selective DNA amplification using
oligonucleotides which are matched for a particular S. pneumoniae serotype(s) and
unmatched for other S. pneumoniae serotype(s); or the selective introduction of a
restriction site using a PCR (or similar) primer matched for a particular S. pneumoniae
serotype(s), followed by a restriction digest. As outlined above, it is preferred that the
nucleotide sequence between the 3' end of the cpsA gene and the 5' end of the cpsB
gene is characterized by DNA sequencing, whilst the analysis of at least a portion of the
wzy and/or wzx gene is performed by procedures involving the detection of
amplification products.

PCR-based methods of detection may rely upon the use of primer pairs, at least
one of which binds specifically to a region of interest in one or more, but not all,
serotypes. Unless both primers bind, no PCR product will be obtained. Consequently,
the presence or absence of a specific PCR product may be used to determine the
presence of a sequence indicative of a specific S. pneumoniae serotype(s). However, as
mentioned, only one primer need correspond to a region of heterogeneity in the
genes/regions of interest. The other primer may bind to a conserved or heterogeneous
region within said gene/region or even a region within another part of the S.
pneumoniae genome, whether said region is conserved or heterogeneous between
serotypes.

Alternatively, primers that bind to conserved regions of the S. pneumoniae
genome but which flank a region whose length varies between serotypes may be used.
In this case, a PCR product will always be obtained when S. pneumoniae bacteria are
present but the size of the PCR product varies between serotypes. Examples of such
varying amplification product lengths are disclosed herein in relation to the wzy and
wzx genes.
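
The two strategies described above (presence or absence of a product, and a product whose length varies) can both be illustrated by predicting the product of a primer pair on a known template. The following is a minimal sketch under simplifying assumptions: primers must match exactly, only the given strand is searched for the forward primer, and the template and primer sequences are short hypothetical examples rather than S. pneumoniae sequences.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def predicted_amplicon_length(template, forward, reverse):
    """Return the predicted product length in nucleotides, or None if either
    primer has no exact binding site (so no product would be expected)."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is searched for downstream of the forward primer.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


# Hypothetical toy sequences for illustration only.
FORWARD = "GACCGTAAGC"
REVERSE = "TCCGTAATCG"
TEMPLATE = "TT" + "GACCGTAAGC" + "TTGGCATC" + "CGATTACGGA" + "AA"
# Same template with one mismatch inside the forward primer site.
VARIANT = TEMPLATE.replace("GACCGTAAGC", "GACCGTTAGC")

if __name__ == "__main__":
    print(predicted_amplicon_length(TEMPLATE, FORWARD, REVERSE))
    print(predicted_amplicon_length(VARIANT, FORWARD, REVERSE))
```

In this toy case the matched template gives a 28-nucleotide product and the mismatched variant gives none. Real primers tolerate some mismatches, so an exact-match rule of this kind is a conservative approximation.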

Furthermore, a combination of specific binding of one or both primers and
variations in the length of the PCR product may be used as a means of identifying
particular molecular serotypes.

In some cases, PCR and other specific hybridisation-based serotyping methods
will involve the use of nucleotide primers/probes which bind specifically to a region of
the genome of a S. pneumoniae serotype which includes a nucleotide which varies
between two or more serotypes. Thus the primers/probes may comprise a sequence
which is complementary to one of such regions. Where positions of heterogeneity are
close together (for instance within 5 or so nucleotides), it may be desirable to use a
primer/probe which hybridises specifically to a region of the S. pneumoniae genome
that comprises two or more positions of heterogeneity. Such primers/probes are likely
to have improved specificity and reduce the likelihood of false positives.
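
Positions of heterogeneity that lie close together can be located directly from an alignment. The following is a minimal sketch, assuming equal-length, gap-free aligned sequences; the three sequences shown are hypothetical and are not taken from Figure 2, and the function names are illustrative only.

```python
def polymorphic_positions(aligned):
    """Return the 1-based alignment columns at which the sequences differ."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    return [
        column + 1
        for column in range(length)
        if len({seq[column] for seq in aligned}) > 1
    ]


def clustered_positions(positions, max_gap=5):
    """Return pairs of neighbouring polymorphic positions lying no more than
    max_gap nucleotides apart, i.e. candidates for a single primer/probe."""
    return [
        (a, b) for a, b in zip(positions, positions[1:]) if b - a <= max_gap
    ]


# Hypothetical toy alignment for illustration only.
TOY_ALIGNMENT = [
    "ACGTACGTACGTACGT",
    "ACGAACGTACGTACGT",
    "ACGTACCTACGTACTT",
]

if __name__ == "__main__":
    sites = polymorphic_positions(TOY_ALIGNMENT)
    print(sites)
    print(clustered_positions(sites))
```

For the toy alignment the polymorphic columns are 4, 7 and 15, of which only 4 and 7 lie within five nucleotides of each other and could be covered by one short primer/probe.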

PCR techniques that utilize fluorescent dyes may be used in the detection
methods of the present invention. These include, but are not limited to, the following
five techniques.

i) Fluorescent dyes can be used to detect specific PCR amplified double
stranded DNA product (e.g. ethidium bromide, or SYBR Green I).

ii) The 5' nuclease (TaqMan) assay can be used which utilizes a specially
constructed primer whose fluorescence is quenched until it is released by the nuclease
activity of the Taq DNA polymerase during extension of the PCR product.

iii) Assays based on Molecular Beacon technology can be used which rely on a
specially constructed oligonucleotide that when self-hybridized quenches fluorescence
(fluorescent dye and quencher molecule are adjacent). Upon hybridization to a specific
amplified PCR product, fluorescence is increased due to separation of the quencher
from the fluorescent molecule.

iv) Assays based on Amplifluor (Intergen) technology can be used which utilize
specially prepared primers, where again fluorescence is quenched due to
self-hybridization. In this case, fluorescence is released during PCR amplification by
extension through the primer sequence, which results in the separation of fluorescent
and quencher molecules.

v) Assays that rely on an increase in fluorescence resonance energy transfer can
be used which utilize two specially designed adjacent primers, which have different
fluorochromes on their ends. When these primers anneal to a specific PCR amplified
product, the two fluorochromes are brought together. The excitation of one
fluorochrome results in an increase in fluorescence of the other fluorochrome.

Probes and primers may be fragments of DNA isolated from nature or may be
synthetic. In one embodiment, primers/probes have a high melting temperature of
>70°C so that they may be used in rapid cycle PCR. Preferably, the primers/probes
comprise at least 10, 15 or 20 nucleotides. Typically, primers/probes consist of fewer
than 50 or 30 nucleotides. Primers/probes are generally polynucleotides comprising
deoxynucleotides. They may also be polynucleotides which include within them
synthetic or modified nucleotides. A number of different types of modification to
oligonucleotides are known in the art. These include methylphosphonate and
phosphorothioate backbones, and addition of acridine or polylysine chains at the 3'
and/or 5' ends of the molecule. For the purposes of the present invention, it is to be
understood that the polynucleotides described herein may be modified by any method
available in the art. Primers/probes may be labelled with any suitable detectable label
such as radioactive atoms, fluorescent molecules or biotin.
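
The length and melting temperature preferences above can be screened with a short Python sketch. The melting temperature estimate used here is the widely quoted basic GC-content approximation for oligonucleotides longer than 13 nucleotides; it is an assumption for illustration and is not stated in this specification to be the method by which the primers were designed. The candidate sequence and the default limits are likewise hypothetical.

```python
def gc_count(seq: str) -> int:
    """Number of G and C bases in the sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def estimate_tm(seq: str) -> float:
    """Rough melting temperature (deg C) from the basic GC-content formula:
    Tm = 64.9 + 41 * (G + C - 16.4) / N, intended for oligos of 14 nt or more.
    Salt and primer concentration are ignored in this approximation."""
    n = len(seq)
    if n < 14:
        raise ValueError("formula is intended for oligos of at least 14 nt")
    return 64.9 + 41.0 * (gc_count(seq) - 16.4) / n


def meets_design_rules(seq: str, min_len: int = 20, max_len: int = 49,
                       min_tm: float = 70.0) -> bool:
    """Check a candidate against a length window and a Tm above min_tm."""
    return min_len <= len(seq) <= max_len and estimate_tm(seq) > min_tm


# Hypothetical candidate primer (not a sequence from this specification).
candidate = "GCGGCTGCCGATGGTCGCTGCCAGCGTGGC"
print(len(candidate), round(estimate_tm(candidate), 1), meets_design_rules(candidate))
```

A nearest-neighbour thermodynamic model would normally give a more reliable estimate; the simple formula is used here only to keep the sketch self-contained.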

The primers may be synthesized using techniques which are well known in the art.
Generally, the primers can be made using synthesizing machines which are
commercially available.

If required, in order to facilitate subsequent cloning of amplified sequences,
primers may have restriction enzyme sites appended to their 5' ends. Thus, all
nucleotides of the primers are derived from the gene sequence of interest or sequences
adjacent to that gene except the few nucleotides necessary to form a restriction enzyme
site. Such enzymes and sites are well known in the art.
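
Appending a restriction enzyme site to the 5' end of a primer can be sketched as below. The recognition sequences for EcoRI, BamHI and HindIII are the standard ones; the primer sequence and the function name are hypothetical and serve only as an illustration.

```python
# Standard recognition sequences (5' to 3') of three common enzymes.
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def add_5prime_site(gene_specific: str, enzyme: str) -> str:
    """Return the primer with the enzyme's recognition site added at its 5' end.

    Raises ValueError if the gene-specific part already contains the site,
    since that primer region (and hence the amplicon) would also be cut."""
    site = RESTRICTION_SITES[enzyme]
    primer = gene_specific.upper()
    if site in primer:
        raise ValueError(f"{enzyme} site already present in gene-specific sequence")
    return site + primer


# Hypothetical gene-specific primer (not taken from this specification).
tailed = add_5prime_site("ACGTACGTACGTACGTACGT", "EcoRI")
print(tailed)
```

In practice a few extra bases are often placed upstream of the site to help the enzyme cut near the end of the product; that refinement is omitted here to match the description above.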

A sample to be typed for the presence and/or identification of a S. pneumoniae
serotype may be from a bacterial culture or a clinical sample from a patient, typically a
human patient. Clinical samples may be cultured to produce a bacterial culture.
However, it is also possible to test clinical samples directly without a culturing step.

The methods of the present invention can be used in a multi-step serotyping
strategy. An example of such a multi-step serotyping strategy (algorithm) is shown in
Table 6. However, a variety of other strategies are envisaged and can be designed by
the skilled person using the sequence heterogeneity information presented herein. In
particular, it is preferred that the serotyping procedure comprise at least one analysis
step based on analysing one or more regions between the 3' end of the cpsA gene and
the 5' end of the cpsB gene. This analysis may optionally be combined with an analysis
of one or more regions within the wzy and/or wzx genes.
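
One way such a multi-step strategy could be organised is sketched below in Python. This is not the algorithm of Table 6; it is a hypothetical outline that assumes a species confirmation step, a list of candidate types from cpsA-cpsB sequence analysis, and a set of types for which a wzy/wzx-specific PCR was positive. All names and type labels are placeholders.

```python
from typing import Iterable, List, Set


def predict_type(species_confirmed: bool,
                 sequence_candidates: Iterable[str],
                 pcr_positive: Set[str]) -> List[str]:
    """Combine the steps and return the remaining candidate types.

    1. If the isolate is not confirmed as S. pneumoniae, nothing is reported.
    2. If sequence analysis gives a single candidate, it is reported as is.
    3. Otherwise candidates are narrowed to those supported by a positive
       specific PCR; if none is supported, all candidates are kept."""
    if not species_confirmed:
        return []
    candidates = list(dict.fromkeys(sequence_candidates))  # de-duplicate, keep order
    if len(candidates) <= 1:
        return candidates
    narrowed = [c for c in candidates if c in pcr_positive]
    return narrowed or candidates


# Placeholder labels, not serotype calls from this specification.
print(predict_type(True, ["typeX", "typeY"], {"typeY"}))
```

Returning a list rather than a single label keeps unresolved cases visible, so that an isolate whose candidates cannot be separated is reported with all of them.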

Microarrays

Analysis of S. pneumoniae genomic sequences using the above techniques may 
take place in solution followed by standard resolution using methods such as gel 
electrophoresis. However in a preferred aspect of the invention, the primers/probes are 
immobilised onto a solid substrate to form arrays. 

The polynucleotide probes are typically immobilised onto or in discrete regions
of a solid substrate. The substrate may be porous to allow immobilisation within the
substrate or substantially non-porous, in which case the probes are typically
immobilised on the surface of the substrate. Examples of suitable solid substrates
include flat glass (such as borosilicate glass), silicon wafers, mica, ceramics and
organic polymers such as plastics, including polystyrene and polymethacrylate. It may
also be possible to use semi-permeable membranes such as nitrocellulose or nylon
membranes, which are widely available. The semi-permeable membranes may be
mounted on a more robust solid surface such as glass. The surfaces may optionally be
coated with a layer of metal, such as gold, platinum or other transition metal.

Preferably, the solid substrate is generally a material having a rigid or semi-rigid
surface. In preferred embodiments, at least one surface of the substrate will be
substantially flat, although in some embodiments it may be desirable to physically
separate synthesis regions for different polymers with, for example, raised regions or
etched trenches. It is also preferred that the solid substrate is suitable for the high
density application of DNA sequences in discrete areas of typically from 50 to 100 µm,
giving a density of 10000 to 40000 cm⁻².
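
The relationship between feature size and array density can be checked with a short calculation. The sketch below assumes square features laid out edge to edge on a regular grid, with the feature size taken as 50 to 100 micrometres, which is the reading consistent with the stated densities; the function name is illustrative.

```python
MICROMETRES_PER_CM = 10_000


def features_per_cm2(pitch_um: int) -> int:
    """Number of square features per square centimetre for a given pitch,
    assuming a regular grid with no gaps between features."""
    per_side = MICROMETRES_PER_CM // pitch_um
    return per_side * per_side


for pitch in (100, 50):
    print(pitch, "um pitch:", features_per_cm2(pitch), "features per cm^2")
```

A pitch of 100 micrometres gives 100 features along each centimetre, and halving the pitch quadruples the density, which matches the range quoted above.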

The solid substrate is conveniently divided up into sections. This may be
achieved by techniques such as photoetching, or by the application of hydrophobic
inks, for example teflon-based inks (Cel-line, USA). Discrete positions, in which the
different probes are located, may have any convenient shape, e.g., circular, rectangular,
elliptical, wedge-shaped, etc.

Attachment of the library sequences to the substrate may be by covalent or
non-covalent means. The library sequences may be attached to the substrate via a layer
of molecules to which the library sequences bind. For example, the probes may be
labelled with biotin and the substrate coated with avidin and/or streptavidin. A
convenient feature of using biotinylated probes is that the efficiency of coupling to the
solid substrate can be determined easily. Since the polynucleotide probes may bind
only poorly to some solid substrates, it is often necessary to provide a chemical
interface between the solid substrate (such as in the case of glass) and the probes.
Thus, the surface of the substrate may be prepared by, for example, coating with a
chemical that increases or decreases the hydrophobicity or coating with a chemical that
allows covalent linkage of the polynucleotide probes. Some chemical coatings may
both alter the hydrophobicity and allow covalent linkage. Hydrophobicity on a solid
substrate may readily be increased by silane treatment or other treatments known in the
art. Examples of suitable chemical coatings include polylysine and
poly(ethyleneimine). Further details of attachment methods are provided in
US 6,248,521.

Techniques for producing immobilised arrays of nucleic acid molecules have been
described in the art. A useful review is provided in Schena et al. (1998), which also
gives references for the techniques described therein.

Microarray-manufacturing technologies fall into two main categories: synthesis
and delivery. In the synthesis approaches, microarrays are prepared in a stepwise
fashion by the in situ synthesis of nucleic acids from biochemical building blocks. With
each round of synthesis, nucleotides are added to growing chains until the desired
length is achieved. A number of prior art methods describe how to synthesise
single-stranded nucleic acid molecule libraries in situ, using for example masking
techniques (photolithography) to build up various permutations of sequences at the
various discrete positions on the solid substrate. US 5,837,832 describes an improved
method for producing DNA arrays immobilised to silicon substrates based on very large
scale integration technology. In particular, US 5,837,832 describes a strategy
called "tiling" to synthesize specific sets of probes at spatially-defined locations on a
substrate which may be used to produce the immobilised DNA libraries of the present
invention. US 5,837,832 also provides references for earlier techniques that may also be
used.

The delivery technologies, by contrast, use the exogenous deposition of
prepared biochemical substances for chip fabrication. For example, DNA may also be
printed directly onto the substrate using for example robotic devices equipped with
either pins (mechanical microspotting) or piezoelectric devices (ink jetting). In
mechanical microspotting, a biochemical sample is loaded into a spotting pin by
capillary action, and a small volume is transferred to a solid surface by physical contact
between the pin and the solid substrate. After the first spotting cycle, the pin is washed
and a second sample is loaded and deposited to an adjacent address. Robotic control
systems and multiplexed printheads allow automated microarray fabrication. Ink jetting
involves loading a biochemical sample, such as a polynucleotide, into a miniature
nozzle equipped with a piezoelectric fitting; an electrical current is used to expel a
precise amount of liquid from the jet onto the substrate. After the first jetting step, the
jet is washed and a second sample is loaded and deposited to an adjacent address. A
repeated series of cycles with multiple jets enables rapid microarray production.

In one embodiment, the microarray is a high density array, comprising greater
than about 50, preferably greater than about 100 or 200 different nucleic acid probes.
Such high density arrays comprise a probe density of greater than about 50, preferably
greater than about 500, more preferably greater than about 1,000, most preferably
greater than about 2,000 different nucleic acid probes per cm². The array may further
comprise mismatch control probes and/or reference probes (such as positive controls).

Microarrays of the invention will typically comprise a plurality of
primers/probes as described above. The primers/probes may be grouped on the array in
any order. Elements in an array may contain only one type of probe/primer or a number
of different probes/primers.

Detection of binding of S. pneumoniae DNA to immobilised probes/primers
may be performed using a number of techniques. For example, the immobilised probes,
which are specific for one or a number of serotypes, may function as capture probes.
Following binding of the genomic DNA to the array, the array is washed and incubated
with one or more labelled detection probes which hybridise specifically to regions of
the S. pneumoniae genome which are conserved (for example the S. pneumoniae psaA
or pneumolysin probes/primers described herein could be utilized for this purpose).
The binding of these detection probes may then be determined by detecting the
presence of the label. For example, the label may be a fluorescent label and the array
may be placed in an X-Y reader under a charge-coupled device (CCD) camera.

Other techniques include labelling the genomic DNA prior to contact with the
array (using nick-translation and labelled dNTPs for example). Binding of the genomic
DNA can then be detected directly.

It is also possible to employ a single PCR amplification step using labelled
dNTPs. In this embodiment, the genomic DNA fragment binds to a first primer present
in the array. The addition of polymerase, dNTPs (including some labelled dNTPs) and a
second primer results in synthesis of a PCR product incorporating labelled nucleotides.
The labelled PCR fragment captured on the plate may then be detected.

A number of available detection techniques do not require labels but instead rely
on changes in mass upon ligand binding (e.g. surface plasmon resonance, SPR). The
principles of SPR and the types of solid substrates required for use in SPR (e.g.
BIACore chips) are described in Ausubel et al., Short Protocols in Molecular Biology
(1999) 4th Ed, John Wiley & Sons, Inc.

Examples of the utilization of microarrays in genotyping include the use of
microarrays to differentiate between closely related Cryptosporidium parvum isolates
and Cryptosporidium species (Straub et al., 2002), and the use of microarrays to
differentiate between species of Listeria (Volokhov et al., 2002). The detection
principles applied in these studies can be used with the polymorphisms/primers/probes
identified by the present inventors to identify different serotypes of S. pneumoniae in a
sample.

Kits 

In one embodiment, kits of the present invention include, in an amount
sufficient for at least one assay, a polynucleotide probe of the invention which
preferentially hybridizes to a target nucleic acid sequence in a test sample under
hybridization assay conditions. Kits containing multiple probes are also contemplated
by the present invention where the multiple probes are designed to target different
nucleic acid sequences from different S. pneumoniae serotypes and may include
distinct labels which permit the probes to be differentially detected in a test sample.
Kits according to the present invention may further comprise at least one of the
following: (i) one or more amplification primers for amplifying a target sequence
contained in or derived from the target nucleic acid; (ii) a capture probe for isolating
and purifying target nucleic acid present in a test sample; and (iii) if a capture probe is
included, a solid support material (e.g., magnetically responsive particles) for
immobilizing the capture probe, either directly or indirectly, in a test sample. Kits of
the present invention may further include one or more helper probes.

Typically, the kits will also include instructions recorded in a tangible form
(e.g., contained on paper or an electronic medium) for using the packaged
polynucleotide in a detection assay for determining the presence or amount of a target
nucleic acid sequence in a test sample. The assay described in the written instructions
may include steps for isolating and purifying the target nucleic acid prior to detection
with the polynucleotide probe, and/or amplifying a target sequence contained in the
target nucleic acid. The instructions will typically indicate the reagents and/or
concentrations of reagents and at least one assay method parameter which might be, for
example, the relative amounts of reagents to use per amount of sample. In addition,
such specifics as maintenance, time periods, temperature and buffer conditions may
also be included.

Uses

As discussed above, S. pneumoniae is a leading cause of morbidity and mortality
causing invasive disease such as meningitis and pneumonia as well as more localised
disease such as acute otitis media and sinusitis. Continued surveillance is critical to monitor
vaccine efficacy and changes in incidence and distribution of colonising and invasive
serotypes. Any increase in disease caused by previously uncommon nonvaccine serotypes
could necessitate a change in vaccine composition. Thus, the detection methods,
probes/primers and microarrays of the invention may be used to monitor the epidemiology
of invasive S. pneumoniae infections to assist in disease control and to inform vaccine
policy.

The molecular typing methods of the invention may also assist in
comprehensive serotype identification that will be useful for epidemiological and other
related studies that will be needed to monitor S. pneumoniae before and after
introduction of S. pneumoniae vaccines.

Examples 

MATERIALS AND METHODS

Pneumococcal reference panels (Table 1) 

Reference panels 1-4, which consisted of 118 isolates, were kindly provided and
serotyped by colleagues in Australia and Canada. All had been serotyped using the
standard Quellung method and included all 23 serotypes represented in the
polysaccharide vaccine, and 28 additional serotypes; there were multiple isolates of 40
serotypes and five isolates that could not be serotyped with available antisera.
Reference panel 5 consisted of 21 invasive isolates from our diagnostic laboratory at
the Centre for Infectious Diseases and Microbiology (CIDM), Sydney, for which
serotypes were known at the beginning of the study. These five reference panels were
used for the development and preliminary evaluation of MCT methods. Panels 2 and 4
were tested by MCT, initially, without knowledge of the conventional serotyping (CS)
results.

Table 1. Conventional serotyping (CS) and molecular capsular typing (MCT)
results of S. pneumoniae strains used in this study.

                15B       15B-C
99-259-1456     18C       18C/18B           18C
00-273-2862     4         4                 4
00-081-2291     33F       33F-g/33A         33F/37
00-118-2067     5         5-c
01-175-0822     7F        7F
00-324-0978     8         8                 8
00-152-1664     22F       22F
00-211-1414     22F       22F
00-200-0078     14        14-g              14
00-118-0159     19F       19F               19F
00-310-1104     4         4                 4

Clinical isolates, New South Wales (CIDM) (note 8)

01-192-3558     6B        6B-g              6B
01-192-2471     6A        6A-C              6B
01-192-1205     6B        6B-g              6B
01-191-1265     14        14-g              14
01-189-0296     19F       19F               19F
01-185-0511     15B       15B-22F
01-184-0328     8         8                 8
01-179-2448     14        14-g              14
01-178-0165     14        14-g              14
01-176-3302     1         1                 1
01-173-2782     4         4                 4
01-170-0873     9V        9V                9V
01-159-0505     14        14-g              14

AF532698; 
AY163172, AY163182 



AF532668 



AF532703; 
AY163178, AY163188 



AF532641 
AF532648 



AY163198, AY163216 
AF532696 



AF532699; 
AY163173, AY163183 



AF532650 






01-157-3399     4         4                 4
01-157-3394     4         4                 4
01-157-2062     4         4                 4
01-152-3295     14        14-g              14
01-150-3706     14        14-g              14
01-144-1862     7F        7F
01-143-3353     4         4                 4
01-124-2300     12F       12F
01-117-1910     4         4                 4
01-096-2050a    9V        9V                9V
01-096-2050b    9V        9V                9V
01-096-2027     9V        9V                9V
01-077-1533     7F        7F
01-075-3257     9N        9N
01-058-3662     14        14-g              14
01-048-1320     19A       19A               19A
01-005-0764     19F       19F               19F
00-361-1217     6B        6B-q              6B
00-357-1164     14        14-g              14
00-339-2918     9N        9N
00-324-0977     8         8                 8
00-315-2993     23F       23F-g=10A-23F     23F
00-315-2254     23F       23F-g=10A-23F     23F
00-310-0630     14        14-g              14
00-303-0303     19F       19F               19F
00-293-1660     19F       19F               19F
00-280-1493     33F       33F-q             33F/37
00-267-0653     8         8                 8
00-258-1120     14        14-g              14
00-257-0881     9V        9V                9V
00-256-1986     6A        6A-ca             6B
00-251-3185     6A        6A-6B-g=6B-g      6B
00-245-3950     23F       23F-g=10A-23F     23F
00-243-2229     3         3                 3
00-242-0394     14        14-g              14
00-241-2964     9V        9V                9V
00-238-3448     23F       23F-g=10A-23F     23F
00-235-3584     19F       19F               19F
00-228-3777     35B       35B
00-225-1482     3         3                 3
00-225-0333     19F       19F               19F
00-217-3003     4         4                 4
00-211-1669     6B        6B-C              6B
00-211-0475     22F       22F
00-211-0469     22F       22F
00-209-3409     3         3                 3
00-208-0179     4         4                 4



AF532650



AY163200, AY163217 



AY163176, AY163186 

AF532700; 
AY163171, AY163181 



AF532665



AF532704; 
AY163179, AY163189






00-200-1013     14        14-g              14
00-200-1012     14        14-g              14
00-199-0498     4         4                 4
00-196-2923     9V        9V                9V
00-192-2087     19A       19A               19A
00-184-1203     6B        6B-q              6B
00-181-1568     23F       23F-g=10A-23F     23F
00-181-1567     23F       23F-g=10A-23F     23F
00-173-3686     4         4                 4
00-164-1705     6B        6B-q              6B
00-163-1533     14        14-g              14
00-149-1265     7F        7F
00-149-1264     7F        7F
00-143-1473     15B       15B-22F
00-138-3435     3         3                 3
00-118-2891     19F       19F               19F
00-093-1315     3         3                 3
00-078-0883     14        14-g              14
00-074-3370     14        14-g              14
00-070-0212     23F       23F-g=10A-23F     23F
00-066-3506     4         4                 4
00-043-0876     19A       19A               19A
00-036-1378     19F       19F               19F
00-008-0865     8         8                 8
99-348-3354     6A        6A-ca             6B
99-338-1052     19F       19F               19F
99-325-0373     23F       23F-G             23F
99-324-1010     4         4                 4
99-404-0191     4         4                 4
99-310-0070     4         4                 4
99-302-1894     9V        9V                9V
99-293-1704     19A       19A               19A
99-287-2376     35B       35B
99-287-2320     35B       35B
99-287-2298     35B       35B
99-284-1034     14        14-c              14
99-276-0568     9V        9V                9V
99-242-0442A    6B        6B-q              6B
99-241-1187A    4         4                 4
99-237-2839     9V        9V                9V
99-235-2193     4         4                 4
99-226-1026B    7F        7F
99-221-2755     9V        9V                9V
99-221-2745A1   23F       23F-g=10A-23F     23F
99-221-0278     4         4                 4
99-218-2527     23F       23F-g=10A-23F     23F
99-201-1708     3         3                 3
99-196-2909B    10A       10A-23F=23F-g     23F-NEG





AF532681 



AF532678 



AF532645 






99-196-2908B    10A       10A-23F=23F-g     23F-NEG
99-196-2882A    10A       10A-23F=23F-g     23F-NEG
99-196-2880A    10A       10A-23F=23F-g     23F-NEG
99-195-0430     14        14-g              14
99-193-2919A    4         4                 4
99-193-2918B    4         4                 4
99-193-2747B    4         4                 4
99-193-2491A    18C       18C/18B           18C
99-192-0047B    23F       23F-g=10A-23F     23F
99-188-2369A    4         4                 4
99-186-2831     7F        7F
99-186-1038     14        14-g              14
99-188-0417     14        14-g              14
99-184-0894     14        14-g              14
99-182-1919     4         4                 4
99-180-2653     4         4                 4
99-178-0901     14        14-g              14
99-177-1060     11A       11A-q
99-176-1983     18C       18C/18B           18C
99-173-2956     4         4                 4
99-169-0432     6B        6B-g              6B
99-159-2018     7F        7F
99-158-1250     14        14-g              14
99-157-0650     19F       19F               19F
99-146-2324     19F       19F               19F
99-144-1497     22F       22F
99-134-2273     3         3                 3
99-132-2724     15B       15B-q
99-132-2558     15B       15B-q
99-132-2557     15B       15B-q
99-130-2037     14        14-g              14
99-110-2820     9N        9N
99-108-0976     23F       23F-g=10A-23F     23F
99-107-0715     14        14-g              14
99-104-1860     4         4                 4
99-099-0423     19F       19F               19F
99-095-1044     20        20/13
99-091-2295     23B       23B               23F-NEG
99-090-2551     14        14-g              14
99-090-2390     3         3                 3
99-090-2387     3         3                 3
99-033-2630     23F       23F-g=10A-23F     23F
99-028-0057     7C        7C
99-011-0311A    4         4                 4

Clinical isolates, New Zealand (ESR) (note 9)

NZSPN00/9       4         4                 4
NZSPN00/42      18C       18C/18B           18C



AF532676 






NZSPN00/59      5         5-q
NZSPN00/87      13        13/20
NZSPN00/88      6B        6B-g              6B
NZSPN00/91      8         8                 8
NZSPN00/319     18B       18B/18C           18C
NZSPN00/[illegible]  7F   7F
NZSPN00/426     3         3                 3
NZSPN00/454     23F       23F-23A=23A-23F   23F
NZSPN00/470     9V        9V                9V
NZSPN00/480     6A        6A-ca             6B
NZSPN00/484     23F       23F-g=10A-23F     23F
NZSPN00/499     19F       19F               19F
NZSPN01/162     2         2-q               2
NZSPN01/243     33F       33F-q             33F/37
NZSPN01/393     35F       35F
NZSPN01/468     11A       11A-q
NZSPN01/481     16F       16F
NZSPN01/484     23F       23F-g=10A-23F     23F
NZSPN01/490     22F       22F
NZSPN01/493     9N        9N
NZSPN01/509     23A       23A-ca            23F-X; 23F-Y-NEG
NZSPN01/510     12F       12F
NZSPN01/520     9V        9V                9V
NZSPN01/531     8         8                 8
NZSPN01/534     3         3                 3
NZSPN01/538     38        38/25F
NZSPN01/543     10A       10A-q
NZSPN01/546     4         4                 4
NZSPN01/547     20        20/13
NZSPN01/548     7F        7F
NZSPN01/549     1         1                 1
NZSPN01/553     17F       17F-C
NZSPN01/554     19F       19F               19F
NZSPN01/555     18C       18C/18B           18C
NZSPN01/557     19A       19A               19A
NZSPN01/559     6A        6A-c              6B
NZSPN01/560     14        14-g              14
NZSPN01/561     6B        6B-q              6B
NZSPN00/12      17F       17F-C
NZSPN00/50      Nonserotypeable  Nonserotypeable-nz
NZSPN00/59      5         5-q
NZSPN00/75      Nonserotypeable  No-amplicon
NZSPN00/180     9V+14     9V                9V+14
NZSPN00/221     38        38/25F
NZSPN00/225     13        13/20
NZSPN00/242     35F       35F
NZSPN00/353     18A       18A               18C



AY163212, AY163228 



AF532679 



AY163203, AY163219 



AF532714 



AF532659; 
AY163209, AY163225 








                33F       33F-q             33F/37
NZSPN01/[illegible]  16F  16F
NZSPN01/122     10A       10A-q
NZSPN01/146     38        38/25F
NZSPN01/166     16F       16F
NZSPN01/204     35B       35B
NZSPN01/209     22A       22A
NZSPN01/240     12F       12F
NZSPN01/254     35F       35F
NZSPN01/262     8         8                 8
NZSPN01/278     6A        6A-6B-q=6B-q      6B
NZSPN01/278     18B       18B/18C           18C
NZSPN01/291     6B        6B-q              6B
NZSPN01/303     Nonserotypeable  No-amplicon
NZSPN01/313     18C       18C/18B           18C
NZSPN01/329     6A        6A-6B-g=6B-g      6B
NZSPN01/335     19A       19A               19A
NZSPN01/344     18C       18C/18B           18C
NZSPN01/361     9N        9N
NZSPN01/363     18C       18C/18B           18C
NZSPN01/366     6A        6A-ca             6B
NZSPN01/369     18C       18C/18B           18C
NZSPN01/374     35B       35B
NZSPN01/387     22F       22F
NZSPN01/388     12F       12F
NZSPN01/389     20        20/13
NZSPN01/403     20        20/13
NZSPN01/411     11A       11A-nz
NZSPN01/418     8         8                 8
NZSPN01/428     3         3                 3
NZSPN01/431     1         1                 1
NZSPN01/437     1         1                 1
NZSPN01/438     22F       22F
NZSPN01/448     11A       11A-q
NZSPN01/455     19A       19A               19A
NZSPN01/463     10A       10A-q
NZSPN01/465     22F       22F
NZSPN01/477     10A       10A-23F=23F-g     23F-NEG
NZSPN01/478     20        20/13
NZSPN01/483     8         8                 8
NZSPN01/485     12F       12F
NZSPN01/489     3         3                 3
NZSPN01/497     9N        9N
NZSPN01/505     19A       19A               19A
NZSPN01/512     7F        7F
NZSPN01/515     3         3                 3
NZSPN01/516     1         1                 1
NZSPN01/529     1         1                 1
NZSPN01/532     4         4                 4
NZSPN01/535     7F        7F


AF532688; 
AY163202, AY163218 



AF532654 



AY163177, AY163187 
AY163213, AY163229



AF532701; 
AY163175, AY163185 



AF532638 
AF532683 






NZSPN01/539     19F       19F               19F
NZSPN01/545     18C       18C/18B           18C
NZSPN01/556     6B        6B-q              6B
NZSPN01/558     14        14-g              14



Notes.

1. CS of selected S. pneumoniae isolates from reference panels 1 and 3 was repeated
by Gail Stewart and Robert Gange at Department of Microbiology, Children's
Hospital at Westmead, New South Wales, Australia.

2. MCT was performed and GenBank accession numbers generated by Fanrong Kong
at Centre for Infectious Diseases and Microbiology (CIDM), Institute of Clinical
Pathology and Medical Research (ICPMR), Westmead Hospital, Westmead, New
South Wales, Australia. See text for molecular capsular subtype (mctsp)
nomenclature.

3. Provided by Denise Murphy, Pneumococcal Reference Laboratory, Public Health
Microbiology, Queensland Health Scientific Services, Queensland, Australia.

4. Provided by Associate Professor Geoff Hogg and Jenny Davis, Microbiological
Diagnostic Unit (MDU), Public Health Laboratory, Department of Microbiology
and Immunology, University of Melbourne, Victoria, Australia.

5. Provided by Dr. Louise P. Jette, Institut National de Sante Publique du Quebec-
Laboratoire de Sante Publique du Quebec, Sainte-Anne-de-Bellevue, Quebec H9X
3R5, Canada.

6. Provided by Dr. Michael Watson, Department of Microbiology, Children's Hospital
at Westmead, New South Wales, Australia.

7. Selected 21 S. pneumoniae clinical isolates, for which CS results were known, from
the CIDM diagnostic laboratory.

8. 152 Australian S. pneumoniae clinical isolates, for which CS results were known,
from the CIDM diagnostic laboratory.

9. 103 New Zealand S. pneumoniae clinical isolates provided by Dr. Diana Martin,
from Streptococcus Reference Laboratory, at Institute of Environmental Science
and Research (ESR), Wellington, New Zealand.

Clinical isolates 

179 consecutive S. pneumoniae clinical isolates from normally sterile sites,
collected during the period January 1999 to June 2001, by the CIDM diagnostic
laboratory, were studied; 21 were randomly selected to make up reference panel 5 (see
above). Dr Diana Martin, Institute of Environmental Science and Research (ESR),
Wellington, New Zealand provided 103 clinical isolates from diagnostic laboratories
throughout New Zealand. Clinical isolates were initially tested using the MCT method,
without knowledge of their CS results (single-blind study). Isolates were retrieved from
storage by subculture on blood agar plates (Columbia II agar base supplemented with
5% horse blood) and incubated overnight in a CO2 incubator at 37°C.
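
Agreement between blinded MCT results and CS results can be tallied along the following lines. This is a sketch only: the five rows are copied from Table 1, the value following the sequence-based type in each row is assumed here to be the serotype/serogroup-specific PCR result, and the two notions of agreement (exact serotype and serogroup level) are illustrative choices rather than the scoring used in the study.

```python
import re
from typing import List, Optional, Tuple

# (strain, CS result, assumed specific-PCR result) taken from Table 1.
ROWS: List[Tuple[str, str, str]] = [
    ("99-259-1456", "18C", "18C"),
    ("00-273-2862", "4", "4"),
    ("00-081-2291", "33F", "33F/37"),
    ("00-200-0078", "14", "14"),
    ("01-192-2471", "6A", "6B"),
]


def serogroup(label: str) -> Optional[str]:
    """Leading digits of a serotype label, e.g. '18C' -> '18'."""
    match = re.match(r"\d+", label)
    return match.group(0) if match else None


def exact_agreement(cs: str, pcr: str) -> bool:
    """True if the CS serotype is one of the '/'-separated PCR calls."""
    return cs in pcr.split("/")


def serogroup_agreement(cs: str, pcr: str) -> bool:
    """True if the CS serogroup matches the serogroup of any PCR call."""
    return serogroup(cs) in {serogroup(p) for p in pcr.split("/")}


exact = sum(exact_agreement(cs, pcr) for _, cs, pcr in ROWS)
group = sum(serogroup_agreement(cs, pcr) for _, cs, pcr in ROWS)
print(f"exact: {exact}/{len(ROWS)}, serogroup level: {group}/{len(ROWS)}")
```

Scoring at serogroup level is relevant where a primer pair is specific for a serogroup rather than a single serotype, as with the serogroup 6 and serogroup 18 primer pairs.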

Conventional serotyping (CS)

CS was performed by the Quellung reaction using rabbit polyclonal antisera
from the Statens Serum Institute, Copenhagen, Denmark (Sorensen, 1993). Briefly, 2
µL of a suspension of isolate, in 10% formalin saline, and 1 µL of antisera, under a
glass coverslip were examined for capsular swelling using a light microscope at 400x
magnification. Clinical isolates from CIDM were serotyped at Department of
Microbiology, Children's Hospital at Westmead, Sydney, Australia and those from
New Zealand by the Streptococcus Reference Laboratory, at ESR, Wellington, New
Zealand. Selected New Zealand clinical isolates for which only serogroup results were
available and selected isolates from reference panels 1 and 3 were re-tested at
Children's Hospital at Westmead.

Molecular capsular typing (MCT) - development of method
Oligonucleotide primers 

The oligonucleotide primers used in this study, their target sites and melting
temperatures are shown in Table 2 and the primer pair specificities and expected
amplicon lengths in Table 3. Primers were designed with high melting temperatures to
be used in rapid cycle PCR (Kong et al., 2000).

Table 3. Specificity and expected lengths of amplicons of primer pairs used
in this study.

Primer pairs (note 1)           Specificity              Length of amplicons (base pairs)

P1/P2                           S. pneumoniae            [illegible]
IIa/IIb                         S. pneumoniae            924
cpsS1/cpsA3 (note 2)            S. pneumoniae            [illegible]
cpsS1/cpsA1 (note 2)            S. pneumoniae            [illegible]
cpsS3/cpsA2 (note 2)            S. pneumoniae            503
1YS/1YA                         serotype 1               [illegible]
2YS/2YA                         serotype 2               [illegible]
4YS/4YA                         serotype 4               [illegible]
6A6BYS/6A6BYA                   serogroup 6              [illegible]
6A6BYS0/6A6BYA1 (note 2)        serogroup 6              747
8YS/8YA                         serotype 8               [illegible]
9VYS/9VYA                       serotype 9V              [illegible]
14YS/14YA                       serotype 14              [illegible]
18CYS/18CYA                     serogroup 18             [illegible]
18CYS0/18CYA1 (note 2)          serogroup 18             671
19FYS/19FYA                     serotype 19F             286
19AYS/19AYA                     serotype 19A             270
19BYS/19BYA                     serotype 19B             42[?]
23FYS/23FYA                     serotype 23F             280
33F37YS/33F37YA                 serotypes 33F/33A/37     310
33F37YS0/33F37YA1 (note 2)      serotypes 33F/33A/37     668
1XS/1XA                         serotype 1               426
2XS/2XA                         serotype 2               429
4XS/4XA                         serotype 4               324
6A6BXS/6A6BXA                   serogroup 6              305
6A6BXS0/6A6BXA1 (note 2)        serogroup 6              1102
8XS/8XA                         serotype 8               325
9VXS/9VXA                       serotype 9V              368
14XS/14XA                       serotype 14              289
18CXS/18CXA                     serogroup 18             368
18CXS0/18CXA1 (note 2)          serogroup 18             721
19FXS/19FXA                     serotype 19F             305
19AXS/19AXA                     serotype 19A             300
19BXS/19BXA                     serotype 19B             327
23FXS/23FXA                     serotypes 23F/23A        401
23FXS0/23FXA1 (note 2)          serotypes 23F/23A        744
33F37XS/33F37XA                 serogroups 33/37         328
33F37XS0/33F37XA1 (note 2)      serotypes 33F/33A/37     746
3S1/3A1                         serotype 3               321
3S2/3A2                         serotype 3               297


Notes. 

1 . See Table 2 for primer sequences. 

2. For sequencing use only. 
Four previously published S. pneumoniae-specific primers, targeting psaA (P1, 
P2) (Morrison et al., 2000) and pneumolysin (IIa, IIb) (Salo et al., 1995) were modified 
to give high melting temperatures and used to confirm that isolates were S. 
pneumoniae. Primers were designed to amplify and sequence a portion of the cpsA-cpsB 
gene region and to amplify serotype/serogroup-specific sequences in the wzy and wzx 
genes of 16 S. pneumoniae serotypes for which cps gene cluster sequences were 
available. To further explore sequence heterogeneity, parts of the wzx and 
wzy genes of isolates belonging to serogroups 6, 18, 23 and 33/37 were also sequenced. 
For serotype 3, which does not contain wzy and wzx genes, serotype-specific PCR 
targeted the orf2 (wze)-cap3A-cap3B region (Arrecubieta et al., 1996). 

DNA preparation, PCR and sequencing 

DNA extraction, PCR and sequencing were performed as previously described 
(Kong et al., 2002). 


Sequence comparison, multiple sequence alignments, and phylogenetic analysis 

Sequences were compared using Bestfit in the Comparison program group. Multiple 
sequence alignments were performed with Pileup and Pretty in the Multiple Sequence 
Analysis program group. Phylogenetic relationships were studied using Ednadist and 
Ekitsch in the Evolutionary Analysis program group. All programs are provided in 
WebANGIS, ANGIS (Australian National Genomic Information Service), 3rd version. 

Nucleotide sequence accession numbers 

The new partial sequence data for the cpsA-cpsB, wzy (polymerase) and wzx 
(flippase) genes for selected reference and clinical isolates reported in this paper have 
appeared in the GenBank Nucleotide Sequence Databases, with accession numbers 
AF532632-AF532715, and AF163171-AF163232, respectively (Table 1). 

Previously reported sequence data used in this paper, in addition to those listed 
in Table 2, have appeared in the GenBank Nucleotide Sequence Databases with the 
following accession numbers: U15171, U66846 and U66845 (cps gene cluster for 
serotype 3); NC_003028 (serotype 4 genome); AJ239004 (cps gene cluster for 
serotype 8); AF030367-AF030372 (cps gene cluster for serotype 19F); AF105113 
(partial cps gene cluster for serotype 19A); AF105114 and AF106137 (partial cps gene 
clusters for serotype 19B); AF105115 (partial cps gene cluster for serotype 19C); 
AF030373 and AF030374 (cps gene clusters for serotype 23F). 
RESULTS 

Both pairs of S. pneumoniae species-specific primers (targeting the psaA and 
pneumolysin genes) produced amplicons of the expected size from all reference and 
clinical isolates except six of 179 CIDM isolates, which, on retesting, were optochin 
resistant and were therefore excluded from further study as they were not S. pneumoniae. 

The sequencing primers, cpsS1/cpsA3, formed amplicons from all but 13 
reference and clinical isolates. Of these 13 isolates, 10 (eight belonging to serotypes 
38/25F and two that were nonserotypable) formed amplicons with primer pairs 
cpsS1/cpsA1 and cpsS3/cpsA2. Three nonserotypable isolates did not form amplicons 
using any of the primer pairs targeting the cpsA-cpsB region, although they had been 
confirmed to be S. pneumoniae by both species-specific PCR assays. 

Sequence heterogeneity in the region between the 3'-end of cpsA and the 5'-end of cpsB 

The present inventors sequenced and analyzed 800 bp fragments of the region 
between the 3'-end of cpsA (starting at base pair 951) and the 5'-end of cpsB (see 
Figure 2). Representative sequences were deposited into GenBank (see Table 1 for 
accession numbers). There were 424 sites that were identical for all 51 serotypes 
represented among the isolates examined, leaving 376 (47%) heterogeneity sites. 
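
The count of heterogeneity sites can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to equal length; the function name and the three toy sequences are hypothetical and are not data from this study. A column is treated as a heterogeneity site when at least two sequences differ at it.

```python
from typing import Dict, List


def heterogeneity_sites(aligned: Dict[str, str]) -> List[int]:
    """Return the 1-based alignment columns at which the sequences are not all identical."""
    seqs = [s.upper() for s in aligned.values()]
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    return [i + 1 for i in range(length) if len({s[i] for s in seqs}) > 1]


# Toy alignment (hypothetical sequences, for illustration only).
toy = {
    "type_A": "TTTCTTGAAA",
    "type_B": "TTTCTTGAGA",
    "type_C": "TCTCTTGAAA",
}
print(heterogeneity_sites(toy))

# Arithmetic reported in the text: 800 sites, 424 identical in all serotypes.
total_sites, identical_sites = 800, 424
variable_sites = total_sites - identical_sites
print(variable_sites, f"{variable_sites / total_sites:.0%}")
```

Applied to the figures given above, 800 - 424 leaves 376 variable columns, which is 47% of the region.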


Intra- and inter-serotype/subtype heterogeneity 

Only single isolates were available for 11 serotypes and the mixed serotype 
9V/14 (see below). Among 40 serotypes for which multiple isolates were available, 14 
were divided into two or more subtypes on the basis of major and/or stable intra-serotype 
heterogeneity. Molecular capsular subtypes (mcst) were named according to 
their conventional serotype (cs) and, generally, the source of the isolate in which the 
sequence difference was first identified [-g (GenBank sequence); -c (CIDM); -q 
(Queensland); -ca (Canada); -nz (New Zealand)]. When sequences characteristic of 
two serotypes were present in the cpsA-cpsB region, subtype names included both, with 
the cs first (e.g. mcst 23F-23A when cs was 23F; mcst 23A-23F when cs was 23A). 
Seventeen serotypes had no intra-serotype heterogeneity and in nine there were minor 
and/or less stable variations between isolates and/or between sequences disclosed 
herein and the corresponding sequences in GenBank (Table 4, Figure 2). 
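
The naming convention described above can be sketched in a few lines of Python. This is only an illustration of the general rule: the function name, its parameters and the lookup table are hypothetical, and names that carry extra qualifiers (for example 15A-ca2 or 6A-6B-q) are outside what this sketch covers.

```python
from typing import Optional

# Source suffixes as listed in the text.
SOURCE_SUFFIX = {
    "GenBank": "g",
    "CIDM": "c",
    "Queensland": "q",
    "Canada": "ca",
    "New Zealand": "nz",
}


def mcst_name(cs: str, source: Optional[str] = None,
              sequence_type: Optional[str] = None) -> str:
    """Build a subtype name from the conventional serotype (cs).

    If the cpsA-cpsB sequence is characteristic of another serotype,
    both are named with the cs first; otherwise the source suffix is used.
    """
    if sequence_type is not None and sequence_type != cs:
        return f"{cs}-{sequence_type}"
    if source is not None:
        return f"{cs}-{SOURCE_SUFFIX[source]}"
    return cs


print(mcst_name("23F", sequence_type="23A"))
print(mcst_name("23A", sequence_type="23F"))
print(mcst_name("33F", source="Queensland"))
```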

There were 368 heterogeneity sites that allowed differentiation between 
molecular capsular types (mct) and subtypes (mcst), including both specific and shared 
sites (Table 4, Figure 2). 
Phylogenetic tree based on the region between the 3'-end of cpsA and the 5'-end of cpsB 

Using these 800 bp sequences, a phylogenetic tree was inferred for the 71 S. 
pneumoniae mct and mcst (Figure 65). S. pneumoniae can be divided into at least two 
classes, based on sequence analysis of the cps A-D region. Typical class I serotypes 
(e.g. 1, 18C, 19F), a typical class II serotype (e.g. 33F, represented by 33F-g) and a 
nontypical class II serotype (19A) were each in different clusters of the tree (Jiang et 
al., 2001). 

The phylogenetic tree provides evidence for, and suggests possible sources of, 
recombination between cpsA-cpsB genes of classes I and II. For example, subtype 23F-c 
clustered with 15A-ca2, but in a separate cluster from other 23F and 15A subtypes, 
suggesting that they may have arisen by recombination between 23F and 15A, 
respectively, and other serotypes. Different subtypes of some other mct were located in 
different clusters and appeared to be only distantly related to each other, e.g. 33F-g and 
33F-q, 2-g and 2-q, 17F-c and 17F-35B. Sharing of identical sequences between 
otherwise unrelated serotype pairs also provides evidence of recombination (see 
above). 

Molecular capsular typing (MCT) based on cpsA-cpsB region sequences 

The mct, assigned on the basis of cpsA-cpsB sequence, was the same as the cs 
for all isolates belonging to 36 of 51 serotypes (or 304 of 394 [77%] isolates), and for 
the majority of isolates (25 of 39) belonging to another five serotypes (Table 5). The 
remaining isolates in these serotypes shared sequences with other serotypes, namely 6A 
with 6B, 10A and 23A with 23F, 15B with 22F and 17F with 35B, presumably as a 
result of recombination. There were five serotype pairs, represented by 46 isolates, 
whose members had identical sequences, namely 20/13, 18C/18B, 38/25F, 31/42 and 
33F-g/33A. 
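
As an illustration of how types with identical sequences in the target region might be found, the following Python sketch groups labels by exact sequence identity. The function name, labels and sequences are hypothetical placeholders, not sequences from this study.

```python
from collections import defaultdict
from typing import Dict, List


def identical_sequence_groups(seqs: Dict[str, str]) -> List[List[str]]:
    """Return sorted groups of labels whose sequences are exactly identical."""
    by_sequence = defaultdict(list)
    for label, seq in seqs.items():
        by_sequence[seq.upper()].append(label)
    # Keep only sequences shared by two or more labels.
    return sorted(sorted(labels) for labels in by_sequence.values() if len(labels) > 1)


# Hypothetical toy input: type_1 and type_2 share a sequence, type_3 is unique.
toy_seqs = {
    "type_1": "ATGGATCAAC",
    "type_2": "atggatcaac",
    "type_3": "ATGGATCTAC",
}
print(identical_sequence_groups(toy_seqs))
```

Members of any group returned this way could not be separated by sequencing this region alone and would need a second target to be distinguished.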

MCT based on PCR targeting wzy and wzx (orf2 [wze]-cap3A-cap3B for serotype 3) 

There is significant sequence heterogeneity in wzy and wzx (data not shown), 
which made them suitable PCR targets for serogroup or serotype identification (Tables 
2 and 3). With few exceptions, primer pairs targeting these genes formed amplicons 
only from the corresponding serotypes represented in the five reference panels. 
The exceptions were: PCR targeting serotype 6B also amplified 6A; PCR targeting 18C 
amplified all serotypes in serogroup 18; PCR targeting wzx (but not wzy) of serotype 
23F amplified three serotype 23A strains; and PCR targeting wzx and wzy of serotypes 
33/37 amplified a 33A isolate, while that targeting wzx amplified a serotype 33B isolate. 






The specificity of serotype 3-specific primers targeting the orf2 (wze)-cap3A-cap3B 
genes (Arrecubieta et al., 1996) was confirmed by production of an amplicon of 
the expected size from all 17 serotype 3 isolates. Thus, a serotype or serogroup was 
assigned by PCR to all 239 isolates belonging to serotypes/serogroups for which 
specific PCR was developed (Table 5). 

Comparison of MCT based on cpsA-cpsB sequencing and PCR/sequencing targeting 
wzx and wzy 

The results of PCR and cpsA-cpsB sequencing were consistent except that PCR 
could not distinguish between some members of serogroups 6, 18, 23 and 33/37 and 
further sequencing (of wzx, wzy) was required to identify individual mct/mcst (see 
below). The cpsA-cpsB sequences of six 10A isolates were identical to those of 23F, 
but the isolates were negative in the 23F-specific PCR targeting wzx and wzy (mcst 
10A-23F). 


Relationships within serogroups 

Sequence analysis of the cpsA-cpsB region and wzy and wzx genes (data not 
shown) showed variable phylogenetic relationships between members of different 
serogroups. 


Serogroup 6 

Mct 6A and 6B were divided into five and three subtypes, respectively, based on 
different sequence patterns in the cpsA-cpsB region. Three 6A isolates had sequences in 
this region characteristic of serotype 6B (Table 4). Serotypes 6A and 6B could not be 
distinguished by PCR targeting wzx and wzy. Sequencing of these genes correctly 
identified all but one of the 6A isolates, but some 6A and 6B subtypes share identical or 
very similar sequences. The serotype of the discrepant isolate (serotype 6A, mcst 6B-q) 
was checked independently by two laboratories (Vakevainen et al., 2001). 

Serogroup 18 

Mct 18C and 18B had identical cpsA-cpsB region sequences and were close to 
18A and 18F in the class I cluster (Figure 65). PCR targeting both wzx and wzy genes 
amplified all four serotypes. Sequences of 18C and 18B were identical to each other, 
but different from those of serotypes 18A and 18F, which were also distinguishable 
from each other. 
Serogroup 23 

Mct 23F, 23A (except mcst 23F-23A and 23A-23F) and 23B were separated into 
different clusters based on cpsA-cpsB sequence differences. Serotype 23A (including 
mcst 23A-23F) was identified on the basis of a positive result with 23F-specific primers 
targeting wzx and a negative result with the corresponding wzy PCR. Sequencing could 
differentiate individual serotypes (23A, 23F and 23B) except mcst 23F-23A and 23A-23F. 
Mcst 23F-c, 23A-23F and 23F-23A have apparently arisen by recombination 
between 23F, 23A and/or others, producing sequences in the cpsA-cpsB regions that are 
quite different from their parental types. 


Serogroups 33 and 37 

Mct 33A and 33F-g share identical cpsA-cpsB sequences and that of 33B is 
similar; 37 and 33F-g cluster together, as do 33B and 33F-q (Figure 65). The 33F/37-specific 
wzx PCR amplified 37, 33F, 33A and 33B, indicating similarities at that site, 
although sequencing showed clear differences between 33B and the others. The 
33F/37-specific wzy PCR amplified 37, 33F and 33A but not 33B. Thus, mct 33B was 
identified on the basis of a positive result with 33F/37-specific primers targeting wzx 
and a negative result with the corresponding wzy PCR. 
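
The wzx/wzy result patterns described for serogroups 23 and 33/37 can be written as a small lookup, sketched here in Python. The table and function names are hypothetical, only the combinations stated in the text are encoded, and any other combination returns None rather than a guess.

```python
from typing import Optional

# (primer set, wzx PCR positive, wzy PCR positive) -> interpretation.
# Only the combinations described in the text are listed.
PCR_RULES = {
    ("23F", True, True): "23F",
    ("23F", True, False): "23A",
    ("33F/37", True, True): "33F, 33A or 37 (sequencing needed to resolve)",
    ("33F/37", True, False): "33B",
}


def interpret_pcr(primer_set: str, wzx_positive: bool,
                  wzy_positive: bool) -> Optional[str]:
    """Return the interpretation for a primer set and result pair, or None if not covered."""
    return PCR_RULES.get((primer_set, wzx_positive, wzy_positive))


print(interpret_pcr("33F/37", True, False))
print(interpret_pcr("23F", True, False))
print(interpret_pcr("23F", False, False))
```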

Other serogroups 

Despite antigenic similarities that determine their membership of the same 
serogroup, mct 9N and 9V appear to be genetically distant, on the basis of significant 
differences between their cpsA-cpsB sequences and the fact that 9V-specific PCR did 
not amplify 9N. 

Similarly, mct 19F and 19A had quite different cpsA-cpsB region sequences and 
separated into different clusters. 19F-specific PCR did not amplify 19A and vice versa. 
There were differences between mct 19F, 19A, 19B and 19C in wzx and wzy sequences 
(except that the wzy sequence of 19C was not available in GenBank), but they formed two 
groups: 19F and 19A; and 19B and 19C. 

Mct 7F and 7C separated into different clusters based on cpsA-cpsB sequences, 
as did 11A and 11B (Figure 65). Mct 15B and 15C had similar cpsA-cpsB sequences 
and clustered together, except for mcst 15B-22F. Mct 17F (including mcst 17F-c and 
17F-35B) and 17A were clustered together. Mct 22F and 22A can be distinguished on 
the basis of a single but very stable heterogeneity site. Mct 35F and 35B are closely 
related based on similar cpsA-cpsB sequences. 
Mixed culture 

One clinical isolate identified as serotype 9/14 using antisera was positive in 
9V- and 14-specific PCR (targeting both wzx and wzy), but was identified as mct 9V by 
sequencing. The isolate was subcultured and 16 individual colonies were tested. All 16 
colonies were positive in both mct 9V-specific PCR assays, negative in both mct 14-specific 
PCR assays, and identified as mct 9V by sequencing. The serotype of the original 
isolate was rechecked and the results (mixed serotype 9/14) were as before. It was 
therefore assumed that the original isolate was a mixture, predominantly of 
serotype/mct 9V with a minor component of serotype/mct 14. 


Comparison of serotype identification results between MCT and CS 

After CS and MCT had been completed, the results were compared. Initial 
results were discrepant for 29 isolates; repeat serotyping and/or correction of clerical 
errors resolved all but five discrepancies. Final results correlated between CS and MCT 
methods for all isolates of 38 serotypes (318 isolates), 20 of 25 isolates of another three 
serotypes, and all five nonserotypable isolates (total 343 isolates). In addition, there 
were 46 isolates belonging to pairs of serotypes whose members could not be 
distinguished from each other by MCT, but all were assigned to the pair that included 
the serotype to which they had been assigned by CS. These results were classified as 
consistent. 

The five discrepant results were: one isolate of serotype 6A was identified as 
mcst 6B-q, two isolates of serotype 15B were identified as mct 22F and two isolates of 
serotype 17F as mct 35B. 
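
The isolate counts reported in this comparison can be checked with a short tally, sketched below in Python. The variable names are illustrative only; every number is taken from the text, and the total of 394 isolates is the figure given earlier for the cpsA-cpsB comparison.

```python
# Counts as reported in the text.
fully_concordant_serotypes = 318   # all isolates of 38 serotypes
partly_concordant = 20             # 20 of 25 isolates of another three serotypes
nonserotypable = 5                 # nontypable by both CS and MCT
consistent_pairs = 46              # isolates in indistinguishable serotype pairs
discrepant = 5                     # 1 x 6A, 2 x 15B, 2 x 17F

correlated = fully_concordant_serotypes + partly_concordant + nonserotypable
accounted_for = correlated + consistent_pairs + discrepant

print(correlated)       # isolates with matching CS and MCT results
print(accounted_for)    # all isolates covered by the three categories
```

The three categories together account for every one of the 394 isolates studied.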

Algorithm for serotype assignment of S. pneumoniae by MCT 

An algorithm for practical use of the MCT method for the identification of S. 
pneumoniae serotypes is shown in Table 6. 

DISCUSSION 

Sequences of 16 cps gene clusters showed that all have the same four genes at 
their 5' ends - cpsA (wzg)-cpsB (wzh)-cpsC (wzd)-cpsD (wze) - which are the sites for 
recombination events that generate new forms of capsular polysaccharide. The 
sequences for different serotypes can be divided into two classes and show evidence of 
interesting recombination patterns. 

The study of 51 serotypes, of which 40 were represented by more than one 
isolate, showed that the cpsA-cpsB sequences for the same serotypes were generally 
stable or could be consistently divided into a small number of subtypes. This shows that 
sequence patterns in this region can be used to identify different 
serotypes/serosubtypes. 

It has been shown previously that PCR-RFLP based on the cpsA-cpsB region 
can predict S. pneumoniae serotypes (Lawrence et al., 2000). However, the method 
generates a long amplicon (1.8kbp), requires the use of three restriction enzymes and 
special equipment and has limited discriminatory ability. 

The present inventors identified 376 sequence heterogeneity sites in the cpsA-cpsB 
region among the 51 serotypes studied (Table 4, Figure 2), which allowed a 
practical MCT assay based on sequencing to be developed. Several pairs of primers 
were designed to amplify a 1001 bp segment within the cpsA-cpsB region, based on the 
following considerations. The primers formed amplicons from virtually all S. 
pneumoniae isolates (>99% of those examined); the amplicon is small enough to be 
amplified using normal PCR protocols; the region of interest (800 bp) can be sequenced 
using a single reaction; and the method is objective. The target included most of the 
variable sites (bp 951 to 1747), providing maximum discrimination between closely 
related serotypes (e.g. members of serogroups 33 and 37 that could not be distinguished 
by serotype-specific PCR). 

Some of the 376 heterogeneity sites in the cpsA-cpsB region were specific for 
individual mct or mcst (Table 4, Figure 2), while others were shared between several. 
Based on these patterns, plus PCR and selective sequencing of type-specific regions of 
wzx and wzy, most of the 51 serotypes represented among our isolates could be 
distinguished and further divided into a total of 71 mct and mcst, with the aid of 
sequence analysis software. The final CS and MCT results correlated for 343 isolates of 
389 (88%) for which results for both methods were available, including five that were 
nontypable by either method. For 46 isolates belonging to five serotype pairs, members 
of which could not be distinguished by sequencing, results were classified as consistent, 
leaving unresolved discrepancies between methods for only five (1.2%) isolates. 

Sequence analysis of the cps gene clusters of 16 serotypes showed that wzy 
(capsular polysaccharide polymerase gene) and wzx (capsular polysaccharide flippase 
gene) are highly variable, making them suitable targets for direct serotype identification 
by PCR. The present inventors designed serotype-specific PCR primers for these 
serotypes, targeting wzx and wzy and, for serotype 3, which has no wzy and wzx genes, 
targeting orf2 (wze)-cap3A-cap3B (Arrecubieta et al., 1996). It was found that 
presumed serotype-specific primers for 6A, 18C, 23F and 33F/37 were not serotype-specific, 
but amplified other related serotypes. To improve the MCT methods, portions 
of the wzy and wzx genes of serotypes within these groups were sequenced, which 
allowed mct and, in some cases, mcst to be distinguished within these serogroups and 
demonstrated relationships between them. 

The present inventors have recognized that the large number of pneumococcal 
serotypes would make it impractical to use serotype-specific PCR for all of them. 
Nevertheless, wzy and wzx PCR can be used to resolve discrepancies between CS and 
cpsA-cpsB region sequencing assays, e.g. for mcst 10A-23F and 23A-23F. Moreover, 
the use of two target regions in the cps gene cluster helps to clarify the relationships 
between mcst that have apparently arisen by recombination. Mct-specific primers were 
evaluated using three reference panels, which had been characterised by CS, and used to 
identify clinical isolates of unknown cs. By PCR alone, 239 (61%) of our 394 clinical 
isolates were assigned to a serotype or serogroup (Table 5). This method can be 
extended to other mct when additional wzx and wzy sequences are available. 

In some circumstances, sequencing of the cpsA-cpsB region may be more 
practical than type-specific PCR. For most serotypes only a single method and fewer 
primers (cpsS1/cpsA3 for most serotypes/isolates) are needed. 

Previous studies have shown that serotypes included in 23-valent polysaccharide 
and 11-, 9-, 7-valent protein conjugate vaccines are those most frequently isolated from 
normally sterile sites (CSF, blood) (Colman et al., 1998; Huebner et al., 2000). Among 
173 consecutive pneumococcal "sterile site" isolates from adults in the CIDM 
diagnostic laboratory, over a 2.5-year period, correlation between the mct and cs was 
good (171/173 CIDM isolates were correctly identified). The exceptions were two cs 
15B isolates that were identified as mct 22F. Five serotypes (4, 14, 19F, 23F, 9V - 
covered by all pneumococcal vaccines) accounted for 57% of isolates. 

Five of 394 isolates studied were nontypable by both CS and MCT (Barker et 
al., 1999). Isolates may be nonserotypable because of decreased type-specific-antigen 
synthesis, nonencapsulated phase variation or insertion or mutation of genes of cps 
gene clusters. Failure to type them by MCT reflects the fact that the sequence database 
is still incomplete, although the target regions of two of the five nonserotypable isolates 
have been sequenced. 

In summary, the present inventors have developed an MCT system for S. 
pneumoniae, which is reproducible, can be performed by any laboratory with access to 
PCR/sequencing and does not require large panels of expensive serotype-specific 
antisera. Work on an international collection of isolates in our reference panels 
demonstrated a strong correlation between the cpsA-cpsB sequence and cs. 
Heterogeneity in a relatively short sequence (800 bp) in this region, supplemented by 
type-specific PCR targeting wzx and wzy, correctly predicted the serotype of most 
unknown isolates belonging to 51 serotypes. These novel MCT methods provide 
comprehensive strain identification that will be useful for the epidemiological studies 
needed to monitor serotype distribution and detect serotype switching, if any, 
among S. pneumoniae isolates before and following introduction and widespread use of 
conjugate vaccines. 

It will be appreciated by persons skilled in the art that numerous variations 
and/or modifications may be made to the invention as shown in the specific 
embodiments without departing from the spirit or scope of the invention as broadly 
described. The present embodiments are, therefore, to be considered in all respects as 
illustrative and not restrictive. 

All publications discussed above are incorporated herein in their entirety. 

Any discussion of documents, acts, materials, devices, articles or the like which 
has been included in the present specification is solely for the purpose of providing a 
context for the present invention. It is not to be taken as an admission that any or all of 
these matters form part of the prior art base or were common general knowledge in the 
field relevant to the present invention as it existed before the priority date of each claim 
of this application. 
   1 gtcaaatctg tcttgattga aaacactgcg gctaaagaag tacttgaaaa acaggtcttg 
  61 gctccatggg atgctttctg tgtggaatta ctataaatat tttttgcaga aaaatttaaa 
 121 attgaaatcg tataaaaaca agggaggact gtataaaaga cagaaatcct ttgtttttta 
 181 taaccaaagt ttataaactt tcattcttga aattcaatta actttacaaa ttccpactat 
 241 taaggagaaa gaagatgaac ataaagaagc gtgtccttag tgcaggcctg acttttgcat 
 301 ctgctttgct tttagctgct tgcggccaat caggttcaga tacaaaaact tactcatcaa 
 361 cctttagtgg aaatccaact acatttaact atctattaga ctattacgct gataatatag 
 421 ttaattgaaa caagaacaag acaaaagagc ctcataaaag gtattgcaac ttggtaatac 
 481 ctttttgagg tgctttttga tatgagccca tgttttctca ataggattgt actcaggtga 
 541 gtagggagga agaggtaaaa gtttataccc aaactcttca cacaagagtt ctaacttacc 
 601 cattctatgg aatcttgcat tatccataat aataaccgat ggtgtgttta atgttggtaa 
 661 gagaaatttc tgaaaccaag cttcaaaaaa gtcgctcgtc atcgtctctt cgtaagttat 
 721 tggagcgatt aactcaccat ttgttagacc tgcaaccaaa gaaatcctct gatatcttct 
 781 tccagatact ttgcctcttc ttaactgacc ttttaatgag cgaccatatt ctcgataaaa 
 841 ataagtatcg aatcctgttt catcaatcta aacaggtgct aggtgcttta aactattaaa 
 901 attcttaaga aataaggcta ctttttctgg gttttgttca tagtaggtgt ggttcttttt 
 961 ttcgagtgta gcccatagct ttgagcgcat agtggatggt agttggatga cagccaaatt 
1021 cagaagctat ttcagtcaaa taagcgtctg gattgtcagt aagatagttt ttaagtctat 
1081 ctctatcaac ttttcttggt tttgttcctt ttacttggtg gtttagctct cctgttttct 
1141 cttttagctt taaccagcca taaatggtat tacgtgatat ttggaaaacg tgtgatgctt 
1201 ctgttatact acctgttcgc tcacaataag agagaacttt tttacgaaaa tctattgaat 
1261 atgccataag aagattatac cacattgtgt actatattag attgaaacta gaatagtaca 
1321 cctctgcttc taaaacattg ttagaaatcg atttgactgt cctgaacgat ttgttctgtt 
1381 cttatttcat tttactatat ttttgtttcg cgggaagtct actaagatac ttaaagatgc 
1441 agatagtaaa aataaaggtg tagacattac cgtaaaaaag tgatataatc gtatagtgtt 
1501 caatgtatag gtattaatca tgagtagacg ttttaaaaaa tcacgttcac agaaagtgaa 
1561 gcgaagtgtt aatatagttt tgctgactat ttatttattg ttagtttgtt ttttattgtt 
1621 cttaatcttt aagtacaata tccttgcttt tagatatctt aatctagtgg taactgcgtt 
1681 agtcctacta gttgccttgg tagggctact cttgattatc tataaaaaag ctgaaaagtt 
1741 tactattttt ctgttggtgt tctctatcct tgtcagctct gtgtcgctct ttgcagtaca 
1801 gcagtttgtt ggactgacca atcgtttaaa tgcgacttct aattactcag aatattcaat 
1861 cagtgtcgct gttttagcag atagtgagat cgaaaatgtt acgcaactga cgagtgtgac 
1921 agcaccgact gggactaata atgaaaatat tcagaaatta ctagctgata tcaagtcaag 
1981 tcagaatacc gatttgacgg tcaaccagag ttcgtcttac ttggcagctt acaagagttt 
2041 gattgcaggg gagactaagg ccattgtcct aaatagtgtc tttgaaaaca tcatcgagtc 
2101 agagtatcca gactacgcat cgaagataaa aaagatttat actaagggat tcactaaaaa 
2161 agtagaagct cctaagacgt ctaagagtca gtctttcaat atctatgtta gtggaattga 
2221 cacctatggt cctattagtt cggtgtcgcg atcagatgtc aacatcctga tgactgtcaa 
2281 tcgagatacc aagaaaatcc tcttgaccac aacgccacgt gatgcctatg taccaatcgc 
2341 agatggtgga aataatcaaa aagataaatt gactcatgcg ggcatttatg gagttgattc 
2401 gtccattcac accttagaaa atctctatgg agtggatatc aattactatg tgcgattgaa 
2461 cttcacttcg tttttgaaat tgattgattt gttgggtgga attgatgttt ataatgatca 
2521 agaatttact gcccatacga atggaaagta ttaccctgca ggcaatgttc atcttgattc 
2581 agaacaggct ctcggttttg ttcgtgagcg ctactcccta gcagatggcg atcgtgaccg 
2641 cgggcgccat caacaaaagg tgattgtggc tatccttcaa aaattaacgt caaccgaagt 
2701 gctgaaaaat tatagtacga tcattaatag cttgcaagat tctatccaaa caaatatgcc 
2761 acttgagacc atgataaatt tggtcaatgc tcagttagaa agtggaggga attataaagt 
2821 aaattctcaa gatttaaaag ggacaggtcg gatggatctt ccttcttatg caatgccaga 
2881 cagtaacctc tatgtgatgg aaatagatga tagtagttta gctgtagtta aagcagctat 
2941 acaggatgtg atggagggta gatgaaatga tagacatcca ttcgcatatc gtttttgatg 
3001 tagatgacgg tcccaagtca agagaggaaa gcaaggctct cttggcagaa tcctacagac 
3061 agggggtgcg aaccattgtt tctacctctc accgtcgcaa gggcatgttt gaaactccgg 
3121 aagagaagat agcagaaaac tttcttcagg ttcgggaaat agctaaggaa gtggcgagtg 
3181 acttggtcat tgcttacggg gctgaaattt attacacacc agatgttctg gataagctgg 
3241 aaaaaaagcg gattccgacc ctcaatgata gtcgttatgc cttgatagag tttagtatga 
3301 acactcctta tcgcgatatt catagcgcct tgagcaagat cttgatgttg ggaattactc 
3361 cagtcattgc ccacattgag cgctatgatg ctcttgaaaa taatgaaaaa cgcgttcgag 
3421 aactgatcga tatgggctgt tacacgcaag taaatagttc acatgtcctc aaacccaaac 
3481 tttttggcga acgttataaa ttcatgaaaa aaagagctca gtatttttta gagcaggatt 
3541 tggttcatgt cattgcaagt gatatgcaca atctagacgg tagacctcct catatggcag 
3601 aagcatatga ccttgttacc caaaaatacg gagaagcgaa ggctcaggaa ctttttatag 
3661 acaatcctcg aaaaattgta atggatcaac taatttagga gaaatgatga aagaacaaaa 
3721 tacgatagaa atcgatgtat ttcaattagt taaaagcttg tggaaacgca agctaatgat 
3781 tttaatagtg gcacttgtga caggtgcggg ggcttttgca tatagcactt ttattgttaa 
3841 gccagaatat acgagtacca cgcgaattta cgtagtgaat cgcaatcaag gagacaagcc 
3901 ggggttgaca aatcaggatt tgcaggcagg aacttatctg gtaaaagact accgtgagat 
3961 tatcctttcg caggatgttt tggaggaagt tgtttctgat 

Figure 1 
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1 '50 

Serotype 25F a — g — ag 1- aat — ta-t g g-c ag — 

Serotype 38 — a — g — ag t- aat — ta-t g g-c ag — 

Serotype 19A t c — t- -g 1 — t c- 

Serotype 23B t c — t- -g 1 — t c- 

Serosubtype 6A-6B-q g — t t 1- -g 1 — t 

Sero subtype 6B~q g — t t 1- -g 1 — t 

Serotype 11B g t c — t- -g 1 — t 

Serotype llA-q a 

Serosubtype 6A-c 

Serosubtype 6A-ca 

Serosubtype 6A-g 

Serosubtype 15A-ca2 

Serosubtype 23F-C g — t — r- t 

Serotype 18B 

Serotype 18C 

Serotype 19F r 

Serotype 18F 

Serotype 1 

Serotype 18A 

Serotype 13 ■ — 

Serotype 20 

Serotype 9N ' a — t t — - 

Serosubtype 15B-c c- 

Serotype 16F -a 

Serosubtype 23A-23F 

Serosubtype 23F-23A 

Serosubtype 15B-q a — t t 

Serosubtype 15C-ca a — t t ' 

Serosubtype 10A-23F ' 

Serosubtype 23F-g 

Serosubtype 14-g . 

Serotype 29 a — t t 

Serotype 7F 

Serosubtype 14-c — t c — t- -g 1 — t c- 

Serosubtype 5-q 

Serosubtype 2-g 

Serotype 41F ; 

Serotype 31 

Serotype 42 : 

Serosubtype 5-c -r 

Serotype 8 — 

Serotype 33B : 

Serosubtype 33F-q 

Serosubtype llA-nz 

Serosubtype 15B-22F 

Serotype 22F 

Serotype 22A • 

Serosubtype 15A~cal — -c- 

Serotype 7C c- 

Serotype 9V t 

Serosubtype 6B-c 

Serotype 21 a — t t 

Serotype 10F a — t t 

Serotype 12F a — t t - " 

Serosubtype 2-q 

Serosubtype 6A-6B-g g — t t 1- -g 1 — -a-t — t 

Serosubtype 6B-g g — t — t 1- -g 1 — -a-t t — - 

Serosubtype 23A-ca g — t t 1- -g 1 — -a-t t 
Serotype 37 g — t t t- -g 1 — -a-t t 

Serotype 17A g — t t 1 g 1 — -a-t t 

Serotype 34 g — t t 1 g 1 — -a-t t 

Serosubtype 17F-35B g — t t 1 g 1 — -a-t t 

Serotype 35B g — t t t- -g 1 — -a-t t 

Serotype 33A g — t t 1 g 1 — -a-t t 

Serosubtype 33F-g g — t t 1- -g 1 — -a-t t 

Serosubtype I7F-C g — t t 1 g 1 — -a-t t 

Serosubtype lOA-q g — t t 1 g 1 — -a-t t 

Serotype 4 g — t t 1- -g 1 — -a-t t 

Serotype 35F g — t t 1- -g 1 — -a-t t 

Serotype 3 g — t t 1- -g at — -a-t t 

Consensus TTTCTTGAAA ATGATTGACT TATTGGGAGG GGTAGATGTT CATAATGATC 



51 100 

Serotype 25F a- - at- -acgg-aa-g -c-at — t — t— a c — 

Serotype 38 a- at- -acgg-aa-g -c-at — t — t — a c — 

Serotype 19A 1 a- -t t — a 

Serotype 23B 1 — ca- -t t — a 

Serosubtype 6A-6B-q -g ca- tgc- aat — aaaa- -c-att-ta- t — t 

Serosubtype 6B-q -g ca- tgc- aat — aaaa- -c-att-ta- t- — t 

Serotype 11B -g— ca- tgc- aat — aaaa- -c-att-ta- t — t- 

Serotype llA-q 

Serosubtype 6A-c -a ; 

Serosubtype 6A-ca -a 

Serosubtype 6A-g -a 

Serosubtype 15A-ca2 

Serosubtype 23F-c . g — 

Serotype 18B 

Serotype 18C — — r 

Serotype 19F 

Serotype 18F — — a 

Serotype 1 ; 

Serotype 18A — w c- 

Serotype 13 1 

Serotype 20 1 

Serotype 9N a t r r - 

Serosubtype 15B-c 

serotype 16F 

Serosubtype 23A-23F 

Serosubtype 23F-23A ^ : > 

Serosubtype 15B-q 7 ' " 

Serosubtype 15C-q 

Serosubtype 15C-ca . 

Serosubtype 10A-23F 

Serosubtype 23F-g 

Serosubtype 14-g — 

Serotype 29 

Serotype 7F c 

Serosubtype 14-c 1 a- -t * 

Serosubtype 5-q 

Serosubtype 2-g 

Serotype 4 IF ? — . 

Serotype 31 

Serotype 42 

Serosubtype 5-c 

Serotype 8 

Serotype 33B 

Serosubtype 33F-q 

Serosubtype llA-nz : — 
Serosubtype 15B-22F 

Serotype 22F 

Serotype 22A 

Serosubtype 15A-cal 

Serotype 7C ' — 

Serotype 9V 

Serosubtype 6B-c 

Serotype 21 

Serotype 10F — 

Serotype 12F — 

Serosubtype 2-q 

Serosubtype 6A-6B-g a- t — c-atacg aatggaaagt a 1 — c 1 

Serosubtype 6B-g a- t — c-atacg aatggaaagt a 1 — c 1 

Serosubtype 23A-ca a- t — c-atacg aatggaaagt a t-c c 1 

Serotype 37 g a- t— c-atacg aatggaaagt a t-a c 1 

Serotype 17A a- t — c-atacg aatggaaagt at — t-c c 1 

Serotype 34 a- t — c-atacg aatggaaagt a t-c c 1 

Serosubtype 17F-35B a- t — c-atacg aatggaaagt a t-c c 1 

Serotype 35B a- t — c-atacg aatggaaagt a t-c c 1 

Serotype 33A a- t — c-atacg aatggaaagt a t-c c 1 

Serosubtype 33F-g a- t— c-atacg aatggaaagt a t-c c 1 

Serosubtype 17F-C a- t — c T atacg aatggaaagt a t-c c 1 

Serosubtype 10A-q a- t — c-atacg aatggaaagt a t-c c 1 

Serotype 4 a- t — c-atacg aatggaaagt a t-c c 1 

Serotype 35F a- t — c-atacg aatggaaagt a — -t-c c 1 

Serotype 3 a- t — c-atacg aatggaaagt a t-a c- 1 

Consensus AAGAGTTTTC AGCTCTACAT GGGAAGTTCC ATTTCCCAGT AGGGAATGTC 

101 150 

Serotype 25F a-t a— tea — a — t c a -tc-t — t — 

Serotype 38 a-t- — a — tea — a — t c a -tc-t — t — 

Serotype 19A — ct 1- g— a g- 1 — c — 

Serotype 23B — ct 1- g — a g- 1 — c — 

Serosubtype 6A-6B-q 1 1- -a — ag a — c-c g- 1 — c — 

Serosubtype 6B-q 1 1- -a — ag a — c-c — g- 1 — c — 

Serotype 11B 1 1- -a — ag a — c-c g- 1 — 

Serotype 11A-q a — -t g- c — 

Serosubtype 6A-c 

Serosubtype 6A-ca —r 

Serosubtype 6A-g — 

Serosubtype 15A-ca2 

Serosubtype 23F-C a — 

Serotype 18B - ; g — 

Serotype 18C g-- 

Serotype 19F — 

Serotype 18F 

Serotype 1 a — -t 

Serotype 18A a — -t » — 

Serotype 13 

Serotype 20 

Serotype 9N - 

Serosubtype 15B-c a — 

Serotype 16F a — -t g- c — 

Serosubtype 23A-23F — 

Serosubtype 23F-23A 

Serosubtype 15B-q . a — 

Serosubtype 15C-q. 1 a — 

Serosubtype 15C-ca a — 

Serosubtype 10A-23F 

Serosubtype 23F-g 

Serosubtype 14-g 
Serotype 29 
Serotype 7F 
Serosubtype 14-c 
Serosubtype 5-q 
Serosubtype 2-g 



Serotype 8 
Serotype 33B 
Serosubtype 33F-q 
Serosubtype 11A-nz 
Serosubtype 15B-22F 
Serotype 22F 
Serotype 22A 



Serosubtype 6B-c 

Serotype 21 

Serotype 10F 

Serotype 12F 

Serosubtype 2-q c 

Serosubtype 6A-6B-g 1 — t- -a — a c g- t 

Serosubtype 6B-g 1 — t- -a — a c g- 

Serosubtype 23A-ca 1 — t- -a — a c g- 

Serotype 37 1 — t- -a — a c g- 

Serotype 17A t--t~ -a — a -a-c g- c — 

Serotype 34 t--t- -a — a c g- c — 

Serosubtype 17F-35B 1 — t- -a — a c : — g- 

Serotype 35B 1 — t- -a — a c g- 

Serotype 33A 1 — t- -a — a c g- 

Serosubtype 33F-g 1 — t- -a — a c g- 

Serosubtype 17F-c 1 — t- -a — a c g- 

Serosubtype 10A-q : — t — t- -a — a c g- c — 

Serotype 4 1 — t- -a — a c g- c — 

Serotype 35F 1 — t- -a — a c g- 

Serotype 3 1 — t- -a — a c g- c — 

Consensus CATCTAGACT CTGAGCAGGC TCTAGGTTTT GTACGTGAAC GCTACTCACT 

151 200 

Serotype 25F a — t -a — a — g — t g c-c-ct- 

Serotype 38 a — t -a — a — g — t g c-c-ct- 

Serotype 19A g 1 — t-a -c — g t c — 

Serotype 23B g 1 — t-a -c — g t c — 

Serosubtype 6A-6B-q g 1 — t g t 

Serosubtype 6B-q g 1 — t g t 

Serotype 11B a 1 — t g t 

Serotype 11A-q a c — t 

Serosubtype 6A-c — c -a- 

Serosubtype 6A-ca — c a- 

Serosubtype 6A-g — c a- 

Serosubtype 15A-ca2 . — c a- 

Serosubtype 23F-C a c — t — c a- 

Serotype 18B — c a- 

Serotype 18C — c a- 

Serotype 19F — c a- 

Serotype 18F t c a- 

Serotype 1 -a — c -a- 

Serotype 18A t — c a- 

Serotype 13 

Serotype 20 — 

Serotype 9N 

Serosubtype 15B-c * 

Serotype 16F a c — t 

Serosubtype 23A-23F a c — t -c — g t 

Serosubtype 23F-23A a c — t -c — g t 

Serosubtype 15B-q 

Serosubtype 15C-q 

Serosubtype 15C-ca 

Serosubtype 10A-23F 1 a- 

Serosubtype 23F-g — t a- 

Serosubtype 14-g — 



Serotype 29 
Serotype 7F 
Serosubtype 14-c 
Serosubtype 5-q 
Serosubtype 2-g 



Serotype 41F -c — g t- 

Serotype 31 -a — 

Serotype 42 -a — 

Serosubtype 5-c -a — 

Serotype 8 — 

Serotype 33B 1- -c — g t- 

Serosubtype 33F-q t- -c — g t- 

Serosubtype 11A-nz 



Serosubtype 15B-22F 

Serotype 22F 

Serotype 22A ' • 

Serosubtype 15A-ca1 - 

Serotype 7C 

Serotype 9V 

Serosubtype 6B-c : ■ — t a- 

Serotype 21 

Serotype 10F 

Serotype 12F 

Serosubtype 2-q -c — g t 

Serosubtype 6A-6B-g a c — t — c a- 

Serosubtype 6B-g a c — t — c a- 

Serosubtype 23A-ca a c — t — ■ -c — g t 

Serotype 37 a c — t c — g t 

Serotype 17A a c — t 

Serotype 34 a c — t 

Serosubtype 17F-35B a c — t -c — g t 

Serotype 35B a c — t -c — g t 

Serotype 33A a c — t -c — g t 

Serosubtype 33F-g a c — t -c — g t 

Serosubtype 17F-c a c — t -c — g t a- 

Serosubtype 10A-q a c --t -c — g t 

Serotype 4 a c — t -c — g c- t 

Serotype 35F a c — t g t ;- — c 

Serotype 3 a c — t -c — g t 

Consensus AGCCGATGGA GACCGTGACC GTGGTCGCAA CCAACAAAAG GTGATTGTGG 



201 250 



Serotype 25F 

Serotype 38 

Serotype 19A 

Serotype 23B 1 — g~g g 1- 

Serosubtype 6A-6B-q 1 --g- t 

Serosubtype 6B-q 1 — g t 

Serotype 11B 1 — g tg c-ga 

Serotype 11A-q -g 

Serosubtype 6A-c -a — ta — a- g — g 1 — t — a — g- ttt c g- 

Serosubtype 6A-ca -a — ta — a- g — g 1 — t — a — g- ttt c g- 

Serosubtype 6A-g -a — ta — a- g — g 1 — t — a — g- ttt c g- 

Serosubtype 15A-ca2 -a — ta g — g 1 — t — a — g- ttt c g- 

Serosubtype 23F-C -a — ta g — g 1 — t — a — g- ttt c a- 

Serotype 18B -a — ta g — g 1 — t — a — g- ttt c g- 

Serotype 18C -a — ta g — g 1 — t — a — g- ttt c g- 

Serotype 19F -a — ta g — g 1 — t — a — g- ttt c g- 

Serotype 18F -a — ta g — g — : 1 — t — a — g- ttt c g- 

Serotype 1 -a — ta «- g — g 1 — t — a — g- ttt c g- 

Serotype 18A -g — ta g — g 1 — t — a — g- tct g- c g- 

Serotype 13 — ; 

Serotype 20 

Serotype 9N 1 ; — 

Serosubtype 15B-c 7 1 c 

Serotype 16F — — - 

Serosubtype 23A-23F 

Serosubtype 23F-23A 

Serosubtype 15B-q -g 

Serosubtype 15C-q -g 

Serosubtype 15C-ca 

Serosubtype 10A-23F . 

Serosubtype 23F-g 

Serosubtype 14-g 

Serotype 29 

Serotype 7F 

Serosubtype 14-c 

Serosubtype 5-q 

Serosubtype 2-g 



Serotype 41F — 

Serotype 31 t- 

Serotype 42 t- 

Serosubtype 5-c t- 



Serotype 8 t 

Serotype 33B 

Serosubtype 33F-q 

Serosubtype 11A-nz 

Serosubtype 15B-22F 

Serotype 22F — 

Serotype 22A 

Serosubtype 15A-ca1 r ~ 

Serotype 7C . 

Serotype 9V g 

Serosubtype 6B-c 

Serotype 21 

Serotype 10F 

Serotype 12F ; 

Serosubtype 2-q -g c- 

Serosubtype 6A-6B-g -a — ta — a- g — g 1 — t — a — g- ttt c g- 

Serosubtype 6B-g -a — ta — a- g — g 1 — t — a — g- ttt c g- 

Serosubtype 23A-ca c 

Serotype 37 

Serotype 17A 

Serotype 34 — ; 

Serosubtype 17F-35B . c 

Serotype 35B --c 

Serotype 33A -~ c 
Serosubtype 33F-g c 

Serosubtype 17F-C c 

Serosubtype 10A-q ■? -t g 

Serotype 4 tg 

Serotype 35F r. -g- 

Serotype 3 a 

Consensus CTATCCTTCA AAAATTAACG TCAACCGAAG CACTGAAAAA TTATAGTACG 

251 300 

Serotype 25F gc-ag -a — ag-g — a — a c-ct -t-caaca — 

Serotype 38 -gc-ag -a — ag-g — a — a c-ct -t-caaca — 

Serotype 19A 
Serotype 23B 
Serosubtype 6A-6B-q 
Serosubtype 6B-q 
Serotype 11B 
Serotype 11A-q 

Serosubtype 6A-c — tc — c-ag -a g — c-t 

Serosubtype 6A-ca — tc — c-ag -a g — c-t 

Serosubtype 6A-g --tc — c-ag -a g — c-t — 



g 
















— g 

g 

g 









Serosubtype 15A-ca2 — tc- -c- -ag -a — g~ 

Serosubtype 23F-C — tc- -c- -ag -a — g— 

Serotype 18B — tc- -c- -ag -a — g— 

Serotype 18C — tc- -c- -ag -a — g— 

Serotype 19F — tc- -c- -ag -a — g~ 

Serotype 18F — tc- -c- -ag -a ~ g— 

Serotype 1 — tc- -c- -ag -a — g~ 

Serotype 18A — tc- -c- -ag -a — g~ 





Serotype 13 

Serotype 20 ; — 

Serotype 9N 

Serosubtype 15B-C g- 

Serotype 16F g- 

Serosubtype 23A-23F 

Serosubtype 23F-23A 

Serosubtype 15B-q g- 

Serosubtype 15C-q g- 

Serosubtype 15C-ca g- 

Serosubtype 10A-23F g- 

Serosubtype 23F-g g- 

Serosubtype 14-g g- 

Serotype 29 

Serotype 7F 

Serosubtype 14-c g- 

Serosubtype 5-q 

Serosubtype 2-g 

Serotype 41F 

Serotype 31 

Serotype 42 

Serosubtype 5-c 

Serotype 8 — 

Serotype 33B . 

Serosubtype 33F-q 

Serosubtype 11A-nz 

Serosubtype 15B-22F 

Serotype 22F 

Serotype 22A 

Serosubtype 15A-ca1 : g- 

Serotype 7C g- 

Serotype 9V 

Serosubtype 6B-c g 

Serotype 21 : 

Serotype 10F 

Serotype 12F c g 

Serosubtype 2-q — t c — t — 

Serosubtype 6A-6B-g — tc — c-ag -a g — c-t 

Serosubtype 6B-g — tc — c-ag -a — g — c-t — - 

Serosubtype 23A-ca 

Serotype 37 * 

Serotype 17A 

Serotype 34 

Serosubtype 17F-35B ' * 

Serotype 35B 7 — 

Serotype 33A 

Serosubtype 33F-g 

Serosubtype 17F-c — 

Serosubtype 10A-q ' r 

Serotype 4 

Serotype 35F 

Serotype 3 g 

Consensus ATCATTAATA GCTTGCAAGA TTCTATCCAA ACAAATATGC CACTTGAGAC 

301 350 

Serotype 25F c — gg-c — a-c a — g — - a 

Serotype 38 c — gg-c — a-c a — g — a 

Serotype 19A c c 1 — a -eg — c — - — 

Serotype 23B c c ■ 1 — a -cg--c 

Serosubtype 6A-6B-q c c 1 — a -eg — c 

Serosubtype 6B-q c c — 1 — a -eg — c 

Serotype 11B c 1 — a -eg — c 

Serotype 11A-q c 

Serosubtype 6A-c g — — a — g g — g 

Serosubtype 6A-ca g — — a — g g — 

Serosubtype 6A-g g — — a — g g — 

Serosubtype 15A-ca2 g — — a — g g — g 

Serosubtype 23F-C g — — a — g g — g 

Serotype 18B g — — a — g g — 

Serotype 18C g — — a — g g — 

Serotype 19F g — — a — g g — g 

Serotype 18F g — — a — g g — 

Serotype 1 g — — a — g g — 

Serotype 18A gg — — a — g -c g — c 

Serotype 13 : 

Serotype 20 • 

Serotype 9N 1 — a -c c 

Serosubtype 15B-c c g — — — t — a -c c 

Serotype 16F c — t — a -c c 

Serosubtype 23A-23F *- 

Serosubtype 23F-23A 

Serosubtype 15B-q 7 

Serosubtype 15C-q 

Serosubtype 15C-ca • - 

Serosubtype 10A-23F ■ 

Serosubtype 23F-g 

Serosubtype 14-g 

Serotype 29 a : a 

Serotype 7F — ; g — 

Serosubtype 14-c • . 

Serosubtype 5-q 

Serosubtype 2-g a 
Serotype 41F a 

Serotype 31 g — — a — g g — 

Serotype 42 g — — a — g g — 

Serosubtype 5-c g — — a — g g — 

Serotype 8 g — — a — g g — 1 

Serotype 33B ; — 

Serosubtype 33F-q 

Serosubtype 11A-nz 1 " — 

Serosubtype 15B-22F * 

Serotype 22F 7 

Serotype 22A 

Serosubtype 15A-ca1 c ; — a 

Serotype 7C c- 

Serotype 9V c 

Serosubtype 6B-c 

Serotype 21 

Serotype 10F 

Serotype 12F 

Serosubtype 2-q g — — a — g g — 

Serosubtype 6A-6B-g g — — a — g g — — g g 

Serosubtype 6B-g g — — a — g g — g g 

Serosubtype 23A-ca • — 

Serotype 37 

Serotype 17A 

Serotype 34 • 

Serosubtype 17F-35B 

Serotype 35B 

Serotype 33A : 

Serosubtype 33F-g 

Serosubtype 17F-c . 

Serosubtype 10A-q c 

Serotype 4 c 

Serotype 35F g — — a — g g — — g 

Serotype 3 g — — a — g g — 

Consensus TATGATAAAT TTGGTCAATG CTCAGTTAGA AAGTGGAGGG AATTATAAAG 

351 400 

Serotype 25F — g g- 

Serotype 38 — g g- r 

Serotype 19A -g g — c — g — g tg — a- c c 

Serotype 23B g — c — g — g gg — a- c c 

Serosubtype 6A-6B-q g — c — g — g gg — a- c c 

Serosubtype 6B-q g — --- c — g — g gg — a- c c 

Serotype 11B g — c — g — g gg — a- c c 

Serotype 11A-q — g c c 

Serosubtype 6A-c g — c — g 

Serosubtype 6A-ca g — c — g 

Serosubtype 6A-g g — c — g 

Serosubtype 15A-ca2 g 

Serosubtype 23F-C — g 

Serotype 18B — g 

Serotype 18C — g 

Serotype 19F :. — g . 

Serotype 18F — g 

Serotype 1 — g 

Serotype 18A — g c 

Serotype 13 : c 

Serotype 20 c 

Serotype 9N 
Serosubtype 15B-c 
Serotype 16F 
Serosubtype 23A-23F c — 

Serosubtype 23F-23A -< c 

Serosubtype 15B-q c 

Serosubtype 15C-q c 

Serosubtype 15C-ca c 

Serosubtype 10A-23F c 

Serosubtype 23F-g c 

Serosubtype 14-g ■ — c 

Serotype 29 

Serotype 7F — g 

Serosubtype 14-c c 

Serosubtype 5-q — g c 

Serosubtype 2-g — — g 

Serotype 41F — g : : 

Serotype 31 — g 

Serotype 42 g 

Serosubtype 5-c — g 

Serotype 8 — g 

Serotype 33B : — g 

Serosubtype 33F-q — g — — 

Serosubtype 11A-nz — g — c 

Serosubtype 15B-22F — g c 

Serotype 22F — g c 

Serotype 22A — g * — c 

Serosubtype 15A-ca1 

Serotype 7C ' 

Serotype 9V — g 

Serosubtype 6B-c — : — g 

Serotype 21 — c 

Serotype 10F c 

Serotype 12F c 

Serosubtype 2-q — g a 

Serosubtype 6B-g — g 

Serosubtype 23A-ca c 

Serotype 37 

Serotype 17A c 

Serotype 34 c — 

Serosubtype 17F-35B • c 

Serotype 35B c 

Serotype 33A c 

Serosubtype 33F-g c 

Serosubtype 17F-C 7 c 

Serosubtype 10A-q — g 

Serotype 4 : ' — g 

Serotype 35F 

Serotype 3 — g 

Consensus TAAATTCTCA AGATXTAAAA GGTACAGGTC GGATGGATCT TCCTTCTTAT 

401 450 

Serotype 25F 1- -t c-gt- g a 1 g-a cc- 

Serotype 38 1 1 c-gt- g a 1 g-a cc- 

Serotype 19A — g -t a ta-c c cc- 

Serotype 23B — g -t a ta-c- -c cc- 

Serosubtype 6A-6B-q — g -t ta-c- -c cc- 

Serosubtype 6B-q — g -t ■ — ta-c- -c-. cc- 

Serotype 11B — g ~t a ta-c- -c cc- 

Serotype 11A-q — g -t ta-c- -c cc- 

Serosubtype 6A-c . 

Serosubtype 6A-ca 

Serosubtype 6A-g 

Serosubtype 15A-ca2 a 

Serosubtype 23F-C 

Serotype 18B 

Serotype 18C 

Serotype 19F 

Serotype 18F 

Serotype 1 

Serotype 18A : 

Serotype 13 

Serotype 20 

Serotype 9N 1 

Serosubtype 15B-C 

Serotype 16F : 

Serosubtype 23A-23F 

Serosubtype 23F-23A 

Serosubtype 15B-q 

Serosubtype 15C-q 

Serosubtype 15C-ca 

Serosubtype 10A-23F 

Serosubtype 23F-g 

Serosubtype 14-g 

Serotype 29 . 

Serotype 7F c 

Serosubtype 14-c 

Serosubtype 5-q : * 1 — 

Serosubtype 2-g 

Serotype 41F 

Serotype 31 

Serotype 42 

Serosubtype 5-c * 

Serotype 8 ; a 

Serotype 33B 

Serosubtype 33F-q 

Serosubtype 11A-nz : 

Serosubtype 15B-22F 1 — 

Serotype 22F 1 — r 

Serotype 22A — ■ 1 — 

Serosubtype 15A-ca1 

Serotype 7C ■ 

Serotype 9V : — . c — 

Serosubtype 6B-c 

Serotype 21 c — 

Serotype 10F 

Serotype 12F - 

Serosubtype 2-q g 

Serosubtype 6A-6B-g : 

Serosubtype 6B-g 

Serosubtype 23A-ca 

Serotype 37 

Serotype 17A 

Serotype 34 — 

Serosubtype 17F-35B " 

Serotype 35B ; — 

Serotype 33A 

Serosubtype 33F-g 

Serosubtype 17F-c 5 — 

Serosubtype 10A-q - : 

Serotype 4 

Serotype 35F 

Serotype 3 : 

Consensus GCAATGCCAG ACAGTAACCT CTATGTGATG GAAATAGATG ATAGTAGTTT 

451 500 

Serotype 25F ct-a-cta-c a -a a — -t c tc — g-a-g 

Serotype 38 . ct-a-cta-c a-aa — -t c tc — g-a-g 

Serotype 19A t — atct — c a-t -t -t c 

Serotype 23B t — atct — c a-t -t -t c 

Serosubtype 6A-6B-q t — atct — c a-g -t c — -t c 

Serosubtype 6B-q t — atct — c a-g -t c — -t c 

Serotype 11B t — atct — c a-t -t -t g 

Serotype 11A-q t — atct — c a-t 1 c — -t c 

Serosubtype 6A-c 

Serosubtype 6A-ca 

Serosubtype 6A-g — gca 



Serosubtype 15A-ca2 ;- — g 

Serosubtype 23F-c 

Serotype 18B c- c — 

Serotype 18C — 1 c- c — 

Serotype 19F 



Serotype 18F c- 

Serotype 1 -a a- 

Serotype 18A - t 

Serotype 13 

Serotype 20 . 

Serotype 9N ; — 

Serosubtype 15B-C — : - 

Serotype 16F 

Serosubtype 23A-23F a- 

Serosubtype 23F-23A a- 

Serosubtype 15B-q 

Serosubtype 15C-q 

Serosubtype 15C-ca 

Serosubtype 10A-23F : 

Serosubtype 23F-g 

Serosubtype 14-g 

Serotype 29 

Serotype 7F r 

Serosubtype 14-c 

Serosubtype 5-q 

Serosubtype 2-g 

Serotype 41F : 

Serotype 31 

Serotype 42 

Serosubtype 5-c * 

Serotype 8 

Serotype 33B 

Serosubtype 33F-q ■ 

Serosubtype 11A-nz -r 

Serosubtype 15B-22F 

Serotype 22F 

Serotype 22A 

Serosubtype 15A-ca1 g 

Serotype 7C 

Serotype 9V 

Serosubtype 6B-c 1 — 

Serotype 21 

Serotype 10F 

Serotype 12F 

Serosubtype 2-q 

Serosubtype 6A-6B-g 

Serosubtype 6B-g ? 

Serosubtype 23A-ca a- 






Serotype 37 

Serotype 17A « 

Serotype 34 

Serosubtype 17F-35B 

Serotype 35B 

Serotype 33A 

Serosubtype 33F-g 

Serosubtype 17F-c 

Serosubtype 10A-q 

Serotype 4 

Serotype 35F 

Serotype 3 a- a 

Consensus AGCTGTAGTT AAAGCAGCTA TACAGGATGT GATGGAGGGT AGATGAAATG 

501 550 

Serotype 25F tg-t- 1 — c — ta — g -g 

Serotype 38 tg-t- 1 — c — ta g g 

Serotype 19A — t — t — t- c — c 

Serotype 23B — t — t — t- c — t — c 

Serosubtype 6A-6B-q — t — t — t- c — t — c 

Serosubtype 6B-q — t — t — 1~ c — t — c 

Serotype 11B — t — t — t- c — t — c 

Serotype 11A-q — t — t — t- c — t — c 

Serosubtype 6A-c 

Serosubtype 6A-ca 

Serosubtype 6A-g -c 

Serosubtype 15A-ca2 — g 

Serosubtype 23F-C '. — 

Serotype 18B 1 

Serotype 18C 1 

Serotype 19F ■ - 

Serotype 18F 1 : 

Serotype 1 c — c 

Serotype 18A c 

Serotype 13 

Serotype 20 — 

Serotype 9N 

Serosubtype 15B-c ~c 

Serotype 16F a — 

Serosubtype 23A-23F 

Serosubtype 23F-23A 

Serosubtype 15B-q 

Serosubtype 15C-q 

Serosubtype 15C-ca 

Serosubtype 10A-23F 

Serosubtype 23F-g 

Serosubtype 14-g - 

Serotype 29 

Serotype 7F 1 

Serosubtype 14-c 

Serosubtype 5-q 

Serosubtype 2-g g a 

Serotype 41F g — ■ 

Serotype 31 • — 

Serotype 42 - 

Serosubtype 5-c 

Serotype 8 

Serotype 33B -- * . — c — c 

Serosubtype 33F-q c — c 

Serosubtype 11A-nz c — c 

Serosubtype 15B-22F 

Serotype 22F 

Serotype 22A 

Serosubtype 15A-ca1 g 

Serotype 7C 

Serotype 9V 

Serosubtype 6B-c 

Serotype 21 a — 

Serotype 10F 

Serotype 12F 

Serosubtype 2-q — 

Serosubtype 6A-6B-g 

Serosubtype 6B-g 

Serosubtype 23A-ca 

Serotype 37 c — 

Serotype 17A c — c 

Serotype 34 c — c 

Serosubtype 17F-35B 

Serotype 35B 

Serotype 33A 

Serosubtype 33F-g 

Serosubtype 17F-c . 

Serosubtype 10A-q 

Serotype 4 : 

Serotype 35F 1 -c — c 

Serotype 3 

Consensus ATAGACATCC ATTCGCATAT CGTTXTTGAT GTAGATGACG GTCCCAAGTC 

551 600 

Serotype 25F c-t a— t — t-ga 1 -g tt g tgat — a— aa-t- 

Serotype 38 c-t a— t —t-ga 1 -g tt g tgat — a— aa-t- 

Serotype 19A 

Serotype 23B — t --a -g 

Serosubtype 6A-6B-q — t : a -g 

Serosubtype 6B-q : 1 a g - 

Serotype 11B — t a -g 

Serotype 11A-q a -g 

Serosubtype 6A-c *- -g a 

Serosubtype 6A-ca -g a 

Serosubtype 6A-g — -g a 

Serosubtype 15A-ca2 

Serosubtype 23F-C 

Serotype 18B — . -g 

Serotype 18C g 

Serotype 19F 

Serotype 18F g 

Serotype 1 a 

Serotype 18A 1 -g a 

Serotype 13 — t a aag g-t— t-at —a — a— t- 

Serotype 20 — t a- aag g-t--t-at —a — a— t- 

Serotype 9N — t a aag g-t — t-at --a--a— t- 

Serosubtype 15B-C 

Serotype 16F -g a 

Serosubtype 23A-23F — a 

Serosubtype 23F-23A a 

Serosubtype 15B-q a 

Serosubtype 15C-q a 

Serosubtype 15C-ca a 

Serosubtype 10A-23F a 

Serosubtype 23F-g a 

Serosubtype 14-g : at 

Serotype 29 

Serotype 7F a 

Serosubtype 14-c at 

Serosubtype 5-q — a a - 

Serosubtype 2-g 

Serotype 41F 

Serotype 31 

Serotype 42 

Serosubtype 5-c — 

Serotype 8 — 

Serotype 33B g — a 

Serosubtype 33F-q . a 

Serosubtype 11A-nz -g a a 

Serosubtype 15B-22F — a a- 

Serotype 22F — a a - 

Serotype 22A a ■ a a- 

Serosubtype 15A-ca1 

Serotype 7C 

Serotype 9V 

Serosubtype 6B-c -g a 

Serotype 21 a 

Serotype 10F 

Serotype 12F 1 -g r-a 

Serosubtype 2-q ; — a 

Serosubtype 6A-6B-g -g a 

Serosubtype 6B-g -g a 

Serosubtype 23A-ca : a 

Serotype 37 

Serotype 17A g a -g a 

Serotype 34 g a -g a 

Serosubtype 17F-35B a a 

Serotype 35B a a 

Serotype 33A a a 

Serosubtype 33F-g a a 

Serosubtype 17F-c a 

Serosubtype 10A-q 

Serotype 4 a 

Serotype 35F g 1 a 

Serotype 3 — t 1 a aag g-t— t-at —a — a— t- 

Consensus AAGAGAGGAA AGCAAGGCTC TCTTGGCAGA ATCCTACAGG CAGGGGGTGC 

601 650 

Serotype 25F -g — a- t — a — a — c c — tc -a — t a — a 

Serotype 38 -g — a t — a — a — c c — tc -a — t a — a 

Serotype 19A 

Serotype 23B — a 

Serosubtype 6A-6B-q — a 

Serosubtype 6B-q : a 

Serotype 11B a 

Serotype 11A-q a 

Serosubtype 6A-c 

Serosubtype 6A-ca - • 

Serosubtype 6A-g ; 

Serosubtype 15A-ca2 

Serosubtype 23F-C 1- 

Serotype 18B 

Serotype 18C — ■ 1 

Serotype 19F — ■ 

Serotype 18F 

Serotype 1 

Serotype 18A a 

Serotype 13 tg g 1 — g — t a- -a — g a 

Serotype 20 tg g 1 — g — t a- -a — g a 



Serotype 9N 
Serosubtype 15B-C 
Serotype 16F 
Serosubtype 23A-23F 
Serosubtype 23F-23A 
Serosubtype 15B-q 
Serosubtype 15C-q 
Serosubtype 15C-ca 
Serosubtype 10A-23F 
Serosubtype 23F-g 
Serosubtype 14-g 
Serotype 29 
Serotype 7F 
Serosubtype 14-c 
Serosubtype 5-q 
Serosubtype 2-g 




























Serotype 41F 1 

Serotype 31 1 1 

Serotype 42 1 1 

Serosubtype 5-c 1 1 

Serotype 8 1 

Serotype 33B 

Serosubtype 33F-q : 

Serosubtype 11A-nz — 

Serosubtype 15B-22F 1 

Serotype 22F 1 

Serotype 22A 1 

Serosubtype 15A-ca1 

Serotype 7C 

Serotype 9V 

Serosubtype 6B-c 

Serotype 21 1- 

Serotype 10F a 

Serotype 12F 1 

Serosubtype 2-q 

Serosubtype 6A-6B-g 

Serosubtype 6B-g . 

Serosubtype 23A-ca 

Serotype 37 

Serotype 17A 

Serotype 34 — 1 

Serosubtype 17F-35B 

Serotype 35B 

Serotype 33A 

Serosubtype 33F-g 

Serosubtype 17F-C 

Serosubtype 10A-q 

Serotype 4 t 

Serotype 35F 

Serotype 3 tg g 1— g — t a a — g a 

Consensus GAACCATTGT CTCTACCTCT CACCGTCGCA AGGGCATGTT TGAAACTCCG 

651 700 

Serotype 25F c t— g t g-gc a- a- a ga~ 

Serotype 38 c 1 — g t g-gc- — a-a-a ga — 

Serotype 19A 

Serotype 23B 

Serosubtype 6A-6B-q : 
Serosubtype 6B-q 

Serotype 11B 

Serotype 11A-q ] 

Serosubtype 6A-c : 

Serosubtype 6A-ca 

Serosubtype 6A-g a 

Serosubtype 15A-ca2 

Serosubtype 23F-C " 

Serotype 18B 1 — a — a — 

Serotype 18C 1— a— a— 

Serotype 19F 1 — a — a — 

Serotype 18F a -t — a a 

Serotype 1 1 — a — a — 

Serotype 18A — a — a- -t ac t a — a — 1 1 — a — a — 

Serotype 13 a — a 1 ac t a — a — 1 1 — a — a — 

Serotype 20 a — a- -t ac t a — a — 1 -t — a — a — 

Serotype 9N -t — a — a — 

Serosubtype 15B-c ; ' -t — a — a — 

Serotype 16F : -t — a — a — 

Serosubtype 23A-23F 1 — a — a — 

Serosubtype 23F-23A 1— a— a — 

Serosubtype 15B-q t — a a 

Serosubtype 15C-q 1 — a — a — 

Serosubtype 15C-ca -t a a 

Serosubtype 10A-23F 1 — a — a — 

Serosubtype 23F-g 1 — a — a — 

Serosubtype 14-g -t a a 

Serotype 29 _t — a a 

Serotype 7F 1 — a — a 

Serosubtype 14-c ~t a a 

Serosubtype 5-q ~ 

Serosubtype 2-g 1 — a — a- 

Serotype 41F -t — a — a 

Serotype 31 ~t — a a 

Serotype 42 t — a — a — 

Serosubtype 5-c — a — a 

Serotype 8 g 1 — a — a — 

Serotype 33B a 

Serosubtype 33F-q 

Serosubtype 11A-nz 

Serosubtype 15B-22F t 

Serotype 22F 

Serotype 22A 

Serosubtype 15A-ca1 

Serotype 7C 

Serotype 9V 

Serosubtype 6B-c 

Serotype 21 — 

Serotype 10F 

Serotype 12F 

Serosubtype 2-q "' 

Serosubtype 6A-6B-g 

Serosubtype 6B-g 

Serosubtype 23A-ca -t — a — a 

Serotype 37 ~t~a — a 

Serotype 17A — . 

Serotype 34 ; 

Serosubtype 17F-35B 

Serotype 35B — 

Serotype 33A 
Serosubtype 33F-g 
Serosubtype 17F-C 
Serosubtype 10A-q 
Serotype 4 
Serotype 35F 
Serotype 3 
Consensus 



Serotype 25F 
Serotype 38 
Serotype 19A 
Serotype 23B 
Serosubtype 6A-6B-q 
Serosubtype 6B-q 
Serotype 11B 
Serotype 11A-q 
Serosubtype 6A-c 
Serosubtype 6A-ca 
Serosubtype 6A-g 

Serosubtype 15A-ca2 
Serosubtype 23F-C 
Serotype 18B 
Serotype 18C 
Serotype 19F 
Serotype 18F 
Serotype 1 
Serotype 18A 
Serotype 13 
Serotype 20 
Serotype 9N 
Serosubtype 15B-c 
Serotype 16F 
Serosubtype 23A-23F 
Serosubtype 23F-23A 
Serosubtype 15B-q 
Serosubtype 15C-q 
Serosubtype 15C-ca 
Serosubtype 10A-23F 
Serosubtype 23F-g 
Serosubtype 14-g 
Serotype 29 
Serotype 7F 
Serosubtype 14-c 
Serosubtype 5-q 
Serosubtype 2-g 

Serotype 41F 
Serotype 31 
Serotype 42 
Serosubtype 5-c 
Serotype 8 
Serotype 33B 
Serosubtype 33F-q 
Serosubtype 11A-nz 
Serosubtype 15B-22F 
Serotype 22F 
Serotype 22A 
Serosubtype 15A-ca1 
Serotype 7C 
Serotype 9V 



• — a — a- -t-t-ac t a — a — 1 -t — a — a — 

GAAGAGAAGA TAGCAGAAAA CTTTCTTCAG GTTCGGGAAA TAGCTAAGGA 



701 750 

ta-t — aga- — t aca- — tta a c 1 — t- 

ta-t — aga- — t aca- — tta a c 1 — t- 

g c « 1 

c g c 

c g _ g- 

c g g- 

C g 





























c— a-— g -a 










c — a — g- a 














———a— —aga— 


- 


-c 


c— -a — g — a 






* 














c — a — g — a 




a-'-aga- 






c — a — g — a 


1 — t- 






















a — aga- 






c — a — g — a 


1— t- 


a — aga- 






c — a — g — a 


1 — t- 


a — aga- 






c — a — g — a 


1 — t- 


a — aga- 






c — a — g — a 


1 — t- 


a — aga- 






c — a — g — a 


1— t- 


a — aga- 






c — a — g — a 


1— t- 


a — aga- 






c — a — g — a 


1 — t- 


a — aga- 






c — a — g — a 


1— t- 












a — aga- 






c — a — g — a 


1— t- 


a — aga- 






c — a — g — a 


1— t- 




























c — a — g — a 


1— t- 
Serosubtype 6B-c 
Serotype 21 
Serotype 10F 
Serotype 12F 
Serosubtype 2-q 
Serosubtype 6A-6B-g 
Serosubtype 6B-g 
Serosubtype 23A-ca 
Serotype 37 
Serotype 17A 
Serotype 34 
Serosubtype 17F-35B 
Serotype 35B 
Serotype 33A 
Serosubtype 33F-g 
Serosubtype 17F-C 
Serosubtype 10A-q 
Serotype 4 
Serotype 35F 
Serotype 3 
Consensus 



-ga- 
-g— 



c — a — g — a 1 — t- 

c — a — g — a — t — t — t- 



a — aga- — t 

a — aga- — t 

g c 

g c 

g c 

g c 

g 

g 

g c 

g c- 

g c 1 

g c _ 

a — aga- — t c--a — g — a 1 — t- 

AGTGGCGAGT GACTTAGTCA TTGCTTATGG GGCTGAAATT TACTACACAC 



Serotype 25F 
Serotype 38 
Serotype 19A 
Serotype 23B 
Serosubtype 6A-6B-q 
Serosubtype 6B-q 
Serotype 11B 
Serotype 11A-q 
Serosubtype 6A-c 
Serosubtype 6A-ca 
Serosubtype 6A-g 



751 

a ca-- 

a ca— 



800 



a- 
a- 



ca- 
ca- 



a— t 

a — t 



-g—a. 
-g— a. 



actt- 
actt- 



c — a- 
c--a- 



-tt-a- 
-tt-a- 



T r ~ g ~ 



c -- t 



a- 

a- 

a- 

a- 

a- 

a- 



Serosubtype 15A-ca2 
Serosubtype 23F-C 
Serotype 18B 
Serotype 18C 
Serotype 19F 
Serotype 18F 
Serotype 1 
Serotype 18A 
Serotype 13 
Serotype 20 
Serotype 9N 
Serosubtype 15B-C 
Serotype 16F 
Serosubtype 23A-23F 
Serosubtype 23F-23A 
Serosubtype 15B-q 
Serosubtype 15C-q 
Serosubtype 15C-ca 
Serosubtype 10A-23F 
Serosubtype 23F-g 
Serosubtype 14-g 
Serotype 29 
Serotype 7F 
Serosubtype 14-c 
Serosubtype 5-q 
Serosubtype 2-g 



tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 
tg- 



-a 
-a 
-a 
-a 
-a 
-a- 
-a- 
-a- 
-a- 



-a 
-a 
-a 
-a 
-a 
-a 
-a 
-a 
-a 
-a 
-a 
-a 



-ag a. a — 
-ag a. a — 
-ag a. a — 
-ag a. a — 
-ag a. a — 
-ag a. a — 
-ag a. a — 
-ag a. a — 
-ag a. a — 
-ag a /a — 
-ag a. a— 
-ag a. a — 
-ag a. a — 
-ag a. a — 
-ag a* a — 
-ag a. a — 
-ag a. a — 
-ag a. a — 
-ag a.a— 
-ag a.a — 
-ag a.a — 
-ag a.a — 
-ag a.a — 



-t 

-t 

-t 

- t _ 

-t- 

-t 

-t 

-t- 

-t 

-t 

-t 



-t 

-t - 

-t 

-t 

-t 

-t 

-t 

-t 
-t- 

Serotype 41F 

Serotype 31 

Serotype 42 

Serosubtype 5-c 

Serotype 8 

Serotype 33B 

Serosubtype 33F-q 

Serosubtype 11A-nz . 

Serosubtype 15B-22F - 

Serotype 22F 

Serotype 22A 

Serosubtype 15A-ca1 a — 

Serotype 7C 

Serotype 9V 

Serosubtype 6B-c . 

Serotype 21 

Serotype 10F 1- c — t- -a a- 

Serotype 12F a — . 

Serosubtype 2-q a 

Serosubtype 6A-6B-g 

Serosubtype 6B-g 

Serosubtype 23A-ca tg c a — a a aa — a 1 

Serotype 37 tg "c a — a a aa — a 1 1 

Serotype 17A 

Serotype 34 , 

Serosubtype 17F-35B 

Serotype 35B 

Serotype 33A 

Serosubtype 33F-g 

Serosubtype 17F-C -. 

Serosubtype 10A-q 

Serotype 4 

Serotype 35F 

Serotype 3 tg c a — a a — a 1- 1 

Consensus CAGATGTTCT GGATAAGCTG GAAAAAAAGC GAGATTCCGA CCCTCAATGA 



Figure 2 
1 atgaaattga agtttcttat aacaaattta tttcatgtct ttttgtctaa tctgattaca 
61 attgtcacat cggttatagt tgtactaatt ttaccaaaaa ttatgggagt aactgagtat 
121 agttattggc aactatatat tttttaccta acatatattg gtttttttca tctgggttgg 
181 attgatggaa tttatcttaa atatggtgga ttagagtacc agaatttaga taagaaacag 
241 ttttattctc aaatacttca atttttcagt tttttaattt taatttcttt tctattattt 
301 ggttttaact tattgattgt gacagatcca aatgcaaaat atatttataa catgactatt 
361 attagtatga tagttacaaa tttaagaatg ttatttgttt atattttgca gatgacaaat 
421 cgattaaagg atagctctat aattctgata agtgatcgcg ttatatatat ttttctttta 
481 tttctgttta ttatatttaa atggcatgaa tacaaggtaa tgatttgggc ggatgtttta 
541 ggaaggacat tttctctcct actttctttt tggacttgta aagatattgt ttttcaatcc 
601 ttatccgagt tcatattgga tctgagagag tcttttgaca atatccgtgt tggaatcaac 
661 ttaatgttat ccaatattgc aagtagtatg attattggta ttgttcgaat gggaattcaa 
721 tggaattgga atatcgaaac attcgggaaa gtatcactga tgctaagcat ctctaattta 
781 ttaatgactt ttattaatgc gattggttta gttgtctttc ctttgttaaa acggacaaaa 
841 acggaaaatt tatctaaaat ttattccaac ttaagaaatg ttttgatgct gatcatgttt 
901 gcaatattgc tcttttatta tcctttaaaa attattctag atctttggtt gccagcttat 
961 cgggatgcgt tgatttttat ggctcttatt tttcctatgt caatttatga agggaagat 



Figure 3 
1 atgaaattga agtttcttat aacaaatttg tttcatgtct ttttgtctaa tctgattaca 

61 attgtcacat cggttatagt tgtactaatc ttaccaaaaa ttatgggagt aactgagtat 

121 agttattggc aactatatat tttttaccta acatatattg gtttttttca tctgggttgg 

181 attgatggaa tttatcttaa atatggtgga ttagagtacc agaatttaga taagaaacag 

241 ttttattctc aaatacttca attttccagt tttttaattt taatttcttt tctattattt 

301 ggttttaact tattgattgt gacagatcca aatgcaaaat atatttataa catgactatt 

361 attagtatga tagttacaaa tttaagaatg ttattcgttt atattttgca gatgacaaat 

421 cgattaaagg atagctctat aattctgata agtgatcgcg ttatatatat ttttctttta 

481 tttctgttta ttatatttaa atggcatgaa tacaaggtaa tgatttgggc ggatgtttta 

541 ggaaggacat tttctctcct actttctttt tggatttgta aagatattgt ttttcaatcc 

601 ttatccgagt tcatattgga tctgagagag tcttttgaca atatccgtgt tggaatcaat 

661 ttaatgttat ccaatattgc aagtagtatg attattggta ttgttcgaat gggaattcaa 

721 tggaattgga atatcgaaac attcgggaaa gtatcactga cgctaagcat ctctaattta 

781 ttaatgactt ttattaatgc gattggttta gttgtctttc ctttgttaaa acggacaaaa 

841 acggaaaatt tatctaaaat ttattccaac ttaagaaatg ttttgatgct gatcatgttt 

901 gcaatattgc tcttttatta tcctttaaaa attattctag atctttggtt gccagcttat 

961 cgggatgcgt tgatttttat ggctcttatt tttcctatgt caatttatga agggaagat 



Figure 4 






1 atgaaattga agtttcttat aacaaatttg tttcatgtct ttttgtctaa tctgattaca 

61 attgtcacat cggttatagt tgtactaatt ttaccaaaaa ttatgggagt aactgagtat 

121 agttattggc aactatatat tttttaccta acatatattg gtttttttca tctgggttgg 

181 attgatggaa tttatcttaa atatggtgga ttagagtacc agaatttaga taagaaacag 

241 ttttattctc aaatacttca attttccagt tttttaattt taatttcttt tctattattt 

301 ggttttaact tattgattgt gacagatcca aatgcaaaat atatttataa catgaccatt 

361 attagtatga tagttacaaa tttaagaatg ttattcgttt atattttgca gatgacaaat 

421 cgattaaagg atagctctat aattctgata agtgatcgcg ttatatatat ttttctttta 

481 tttctgttta ttatatttaa atggcatgaa tacaaggtaa tgatttgggc ggatgtttta 

541 ggaaggacat tttctctcct actttctttt tggatttgta aagatattgt ttttcaatcc 

601 ttatccgagt tcatattgga tctgagagag tcttttgaca atatccgtgt tggaatcaat 

661 ttaatgttat ccaatattgc aagtagtatg attattggta ttgttcgaat gggaattcaa 

721 tggaattgga atatcgaaac attcgggaaa gtatcactga cgctaagcat ctctaattta 

781 ttaatgactt ttattaatgc gattggttta gttgtctttc ctttgttaaa acggacaaaa 

841 acggaaaatt tatctaaaat ttattccaac ttaagaaatg ttttgatgct gatcatgttt 

901 gcaatattgc tcttttatta tcctttaaaa attattctag atctttggtt gccagcttat 

961 cgggatgcgt tgatttttat ggctcttatt tttcctatgt caatttatga agggaagat 



Figure 5 
1 atgaaattga agtttcttat aacaaatttg tttcatgtct ttttgtctaa tctgattaca 

61 attgtcacat cggttatagt tgtactaatc ttaccaaaaa ttatgggagt aactgagtat 

121 agttattggc aactatatat tttttaccta acatatattg gtttttttca tctgggttgg 

181 attgatggaa tttatcttaa atatggtgga ttagagtacc agaatttaga taagaaacag 

241 ttttattctc aaatacttca atttttcagt tttttaattt taatttcttt tctattattt 

301 ggttttaact tattgattgt gacagatcca aatgcaaaat atatttataa catgactatt 

361 attagtatga tagttacaaa tttaagaatg ttattcgttt atattttgca gatgacaaat 

421 cgattaaagg atagctctat aattctgata agtgatcgcg ttatatatat ttttctttta 

481 tttctgttta ttatatttaa atggcatgaa tacaaggtaa tgatttgggc ggatgtttta 

541 ggaaggacat tttctctcct actttctttt tggatttgta aagatattgt ttttcaatcc 

601 ttatccgagt tcatattgga tctgagagag tcttttgaca atatccgtgt tggaatcaat 

661 ttaatgttat ccaatattgc aagtagtatg attattggta ttgttcgaat gggaattcaa 

721 tggaattgga atatcgaaac attcgggaaa gtatcactga cgctaagcat ctctaattta 

781 ttaatgactt ttattaatgc gattggttta gttgtctttc ctttgttaaa acggacaaaa

841 acggaaaatt tatctaaaat ttattccaac ttaagaaatg ttttgatgct gatcatgttt 

901 gcaatattgc tcttttatta tcctttaaaa attattctag atctttggtt gccagcttat 

961 cgggatgcgt tgatttttat ggctcttatt tttcctatgt caatttatga agggaagat 



Figure 6 
1 atgaaattga agtttcttat aacaaattta tttcatgtct ttttgtctaa tctgattaca

61 attgtcacat cggttatagt tgtactaatt ttaccaaaaa ttatgggagt aactgagtat 

121 agttattggc aactatatat tttttaccta acatatattg gtttttttca tctgggttgg 

181 attgatggaa tttatcttaa atatggtgga ttagagtacc agaatttaga taagaaacag 

241 ttttattctc aaatacttca atttttcagt tttttaattt taatttcttt tctattattt 

301 ggttttaact tattgattgt gacagatcca aatgcaaaat atatttataa catgactatt 

361 attagtatga tagttacaaa tttaagaatg ttatttgttt atattttgca gatgacaaat 

421 cgattaaagg atagctctat aattctgata agtgatcgcg ttatatatat ttttctttta 

481 tttctgttta ttatatttaa atggcatgaa tacaaggtaa tgatttgggc ggatgtttta 

541 ggaaggacat tttctctcct actttctttt tggacttgta aagatattgt ttttcaatcc

601 ttatccgagt tcatattgga tctgagagag tcttttgaca atatccgtgt tggaatcaac 

661 ttaatgttat ccaatattgc aagtagtatg attattggta ttgttcgaat gggaattcaa

721 tggaattgga atatcgaaac attcgggaaa gtatcactga tgctaagcat ctctaattta 

781 ttaatgactt ttattaatgc gattggttta gttgtctttc ctttgttaaa acggacaaaa 

841 acggaaaatt tatctaaaat ttattccaac ttaagaaatg ttttgatgct gatcatgttt 

901 gcaatattgc tcttttatta tcctttaaaa attattctag atctttggtt gccagcttat

961 cgggatgcgt tgatttttat ggctcttatt tttcctatgt caatttatga agggaagat



Figure 7 
1 atgaaattga agtttcttat aacaaatttg tttcatgtct ttttgtctaa tctgattaca

61 attgtcacat cggttatagt tgtactaatc ttaccaaaaa ttatgggagt aactgagtat 

121 agttattggc aactatatat tttttaccta acatatattg gtttttttca tctgggttgg 

181 attgatggaa tttatcttaa atatggtgga ttagagtacc agaatttaga taagaaacag 

241 ttttattctc aaatacttca atttttcagt tttttaattt taatttcttt tctattattt 

301 ggttttaact tattgattgt gacagatcca aatgcaaaat atatttataa catgactatt 

361 attagtatga tagttacaaa tttaagaatg ttattcgttt atattttgca gatgacaaat 

421 cgattaaagg atagctctat aattctgata agtgatcgcg ttatatatat ttttctttta 

481 tttctgttta ttatatttaa atggcatgaa tacaaggtaa tgatttgggc ggatgtttta 

541 ggaaggacat tttctctcct actttctttt tggatttgta aagatattgt ttttcaatcc 

601 ttatccgagt tcatattgga tctgagagag tcttttgaca atatccgtgt tggaatcaat 

661 ttaatgttat ccaatattgc aagtagtatg attattggta ttgttcgaat gggaattcaa 

721 tggaattgga atatcgaaac attcgggaaa gtatcactga cgctaaacat ctctaattta 

781 ttaatgactt ttattaatgc gattggttta gttgtctttc ctttgttaaa acggacaaaa 

841 acggaaaatt tatctaaaat ttattccaac ttaagaaatg ttttgatgct gatcatgttt 

901 gcaatattgc tcttttatta tcctttaaaa attattctag atctttggtt gccagcttat

961 cgggatgcgt tgatttttat ggctcttatt tttcctatgt caatttatga agggaagat 



Figure 8 
1 atgaaattga agtttcttat aacaaatttg tttcatgttc ttttgtctaa tctgattaca

61 attcttacat cagttatagt tgtactaatt ttaccaaaaa ttatgggagt aactgagtat 

121 agttattggc aactatatat tttttaccta acatatattg gtttttttca tctgggatgg 

181 attgatggaa tttatcttaa atatggcgga ttagagtacc agaacttaga taagaaacag 

241 ttttattctc aaatacttca attttccagt tttttaattt taatttcttt tctattattt 

301 ggttttaact tattgactgt gacagatcaa aatgcaaaat atatttataa catgactatt 

361 attagtatga tagttacaaa tttaagaatg ttattcgttt atattttgca gatgacaaat 

421 cgattaaagg atagttccat cattctaatc agtgatcgcg ttatatatgt tattctttta 

481 ttcctgttta ttatatttaa atggcatgaa tacaaggtaa tgatttgggc agatgttttg

541 ggaaggacat tttctctcct actttctttt tggatttgta aagatattgt ttttcaatcc 

601 ttatccgagt ttatattgga tctgagagag tcttttgaca atatccgtgt tggaatcaat 

661 ttaatgttat ccaatattgc aagtagtatg attattggta ttgttcgaat gggaattcaa 

721 tggaattgga atatcgaaac attcgggaaa gtatcactga cgctaagcat ctctaattta 

781 ttaatgactt ttattaatgc gattggttta gttgtttttc ctttgttaaa acggacaaaa 

841 acggaaaatt tatctaaaat ttattccaac ttaagaaatg ttttgatgct tatcatgttc

901 gcgattttgc tcatttacta tcctttaaaa attgtattag acctctggtt gccagcctat 

961 caagatgcct tgattttcat ggctcttatt tttcctatgt caatttatga agggaagat 



Figure 9 
1 atgaaattga agtttcttat aacaaatttg tttcatgtct ttttatctaa tctgattaca

61 attgtcacat cggttatagt tgtactaatt ttaccaaaaa ttatgggagt aactgagtat 

121 agttattggc aactatatat tttttaccta acatatattg gtttttttca tctgggttgg 

181 attgatggaa tttatcttaa atatggtgga ttagagtacc agaatttaga taagaaacag

241 ttttattctc aaatacttca atttttcagt tttttaattt taatttcttt tctattattt 

301 ggttttaact tattgattgt gacagatcca aatgcaaaat atatttataa catgactatt 

361 attagtatga tagttacaaa tttaagaatg ttattcgttt atattttgca gatgacaaat 

421 cgattaaagg atagctctat aattctgata agtgatcgcg tcatatatat ttttctttta 

481 tttctgttta ttatatttaa atggcatgaa tacaaggtaa tgatttgggc ggatgtttta

541 ggaaggacat tttctctcct actttctttt tggatttgta aagatattgt ttttcaatcc 

601 ttatccgagt tcatattgga tctgagagaa tcttttgaca atatccgtgt tggaatcaat 

661 ttaatgttat ccaatattgc aagtagtatg attattggta ttgttcgaat gggaattcaa 

721 tggaattgga atatcgaaac attcgggaaa gtatcactga cgctaagcat ctctaattta 

781 ttaatgactt ttattaatgc gattggttta gttgtctttc ctttgttaaa acggacaaaa

841 acggaaaatt tatctaaaat ttattccaac ttaagaaatg ttttgatgct gatcatgttt 

901 gcaatattac tcttttatta tcctttaaaa attattctag atctttggtt gccagcttat 

961 cgggatgcgt tgatttttat ggctcttatt tttcctatgt caatttatga agggaagat 



Figure 10 
1 atgaaattga agtttcttat aacaaatttg tttcatgtct ttttatctaa tctgattaca

61 attgtcacat cggttatagt tgtactaatt ttaccaaaaa ttatgggagt aactgagtat 

121 agttattggc aactatatat tttttaccta acatatattg gtttttttca tctgggttgg 

181 attgatggaa tttatcttaa atatggtgga ttagagtacc agaatttaga taagaaacag 

241 ttttattctc aaatacttca atttttcagt tttttaattt taatttcttt tctattattt 

301 ggttttaact tattgattgt gacagatcca aatgcgaaat atatttataa catgactatt 

361 attagtatga tagttacaaa tttaagaatg ttattcgttt atattttgca gatgacaaat 

421 cgattaaagg atagctctat aattctgata agtgatcgcg tcatatatat ttttctttta 

481 tttctgttta ttatatttaa atggcatgaa tacaaggtaa tgatttgggc ggatgtttta 

541 ggaaggacat tttctctcct actttctttt tggatttgta aagatattgt ttttcaatcc 

601 ttatccgagt tcatattgga tctgagagaa tcttttgaca atatccgtgt tggaatcaat 

661 ttaatgttat ccaatattgc aagtagtatg attattggta ttgttcgaat gggaattcaa 

721 tggaattgga atatcgaaac attcgggaaa gtatcactga cgctaagcat ctctaattta 

781 ttaatgactt ttattaatgc gattggttta gttgtctttc ctttgttaaa acggacaaaa 

841 acggaaaatt tatctaaaat ttattccaac ttaagaaatg ttttgatgct gatcatgttt 

901 gcaatattac tcttttatta tcctttaaaa attattctag atctttggtt gccagcttat 

961 cgggatgcgt tgatttttat ggctcttatt tttcctatgt caatttatga agggaagat 



Figure 11 
1 atgaaattga agtttcttat aacaaatttg tttcatgttc ttttgtctaa tctgattaca

61 attcttacat cagttatagt tgtactaatt ttaccaaaaa ttatgggagt aactgagtat 

121 agttattggc aactatatat tttttaccta acatatattg gtttttttca tctgggatgg 

181 attgatggaa tttatcttaa atatggcgga ttagagtacc agaacttaga taagaaacag 

241 ttttattctc aaatacttca attttccagt tttttaattt taatttcttt tctattattt 

301 ggttttaact tattgactgt gacagatcaa aatgcaaaat atatttataa catgactatt 

361 attagtatga tagttacaaa tttaagaatg ttattcgttt atattttgca gatgacaaat 

421 cgattaaagg atagttccat cattctaatc agtgatcgcg ttatatatgt tattctttta 

481 ttcctgttta ttatatttaa atggcatgaa tacaaggtaa tgatttgggc agatgttttg 

541 ggaaggacat tttctctcct actttctttt tggatttgta aagatattgt ttttcaatcc 

601 ttatccgagt ttatattgga tctgagagag tcttttgaca atatccgtgt tggaatcaat

661 ttaatgttat ccaatattgc aagtagtatg attattggta ttgttcgaat gggaattcaa 

721 tggaattgga atatcgaaac attcgggaaa gtatcactga cgctaagcat ctctaattta 

781 ttaatgactt ttattaatgc gattggttta gttgtttttc ctttgttaaa acggacaaaa 

841 acggaaaatt tatctaaaat ttattccaac ttaagaaatg ttttgatgct tatcatgttc 

901 gcgattttgc tcatttacta tcctttaaaa attgtattag acctctggtt gccagcctat 

961 caagatgcct tgattttcat ggctcttatt tttcctatgt caatttatga agggaagat 



Figure 12 
1 atgcttttaa atttcttatt catatctatt tttctattaa ttatcattac atttatatta

61 tttgaggggg atttttttca acctgcagta attttaacac tcacttattt tatttcgatt 

121 gcaagtgctc tagttaatag aaatgtttgg ggaacagaac tccatttcaa aacctttggt 

181 ttgatattgt taggggttgc tacatttatt atagtttcct tgttgacaaa attgtcgtac 

241 aggcctaaag tggagggaat ttcgtatgaa gaattgaaag aaataaatcc ttcaaagata

301 atctatgtca ttcttctgat tctaaatctt gttatgctat ttctttatac ccgtgaaatt

361 cagaaagtgg tattgttttc aggtagaagt ttttctaata ttacagattt gataagtaac 

421 tataggtacc tatcttatta ttcaaatgaa gtagaaataa gtggaatgat taatcaacta 

481 tctaaaatta ttccagcgac tacacttatt tctttatata tatttataaa taattatttt 

541 ataactaaac aaataaagaa aaatttcatt tatttgattc caatagctat attctttgtc 

601 tatgcaatca ttagtggtgg tagattgccc cttataaggt tagttgttgg agctctgttg 

661 atattgtata tatactctgt gtacgggagt cctaaatctc aacttaccaa aagttttaaa 

721 atgatcactc gctctctgtt tac 
1 atgcttttaa atttcttatt catatctatt tttctattaa ttatcattac atttatatta

61 tttgaggggg atttttttca acctgcagta attttaacaa tcgcttattt tatttcgatt

121 gcaagtgctc tagttaatag aaatgtttgg ggaacagaac tccatttcaa aaccttttat

181 ttgatattgt taggggttgc tacatttgtt atagtttcct tgttgacaaa attgtcgtac

241 aggcctaaag tggagggaat ttcgcatgaa gaattgaaag aaataaatcc ttcaaagata

301 atctatgtca ttcttctgac tctaaatctt gttatgttat ttctttatat ccgtgaaatt

361 cagaaagtag tattgttttc aggtagaagt ttttctaata ttacagattt gataagtaac

421 tataggtacc tatcttatta ttcaaatgaa gtagaaaatc gtgtaagtgg aatgattaat

481 caactatcta aaattattcc agcgactaca cttatttctt tatatatatt tatgaataat

541 tattttataa ctaaacaaat aaagaaaaat ttcatttatt tgattccaat agctatattc

601 tttgtctatg caatcattag tggtggtaga ttgcccctta taaggttagt tgttggatct

661 ctgttgatat tgtatatata ctctgtgtac gggagtccta aatctcaact taccaaaagt

721 tttaaaatga tcactcgctc tctgtttac
1 atgcttttaa atttcttatt catatctatt tttctattaa ttatcattac atttatatta

61 tttgaggggg atttttttca acctgcagta attttaacaa tcgcttattt tatttcgatt 

121 gcaagtgctc tagttaatag aaatgtttgg ggaacagaac tccatttcaa aaccttttat

181 ttgatattgt taggggttgc tacatttatt atagtttcct tgttgacaaa attgtcgtac 

241 aggcctaaag tggagggaat ttcgcatgaa gaattgaaag aaataaatcc ttcaaagata 

301 atctatgtca ttcttctgat tctaaatctt gttatgctat ttctttatat ccgtgaaatt 

361 cagaaagtgg tattgttttc aggtagaagt ttttctaata ttacagattt gataagtaac 

421 tataggtacc tatcttatta ttcaaatgaa gtagaaaatc gtgtaagtgg aatgattaat 

481 caactatcta aaattattcc agcgactaca cttatttctt tatatatatt tataaataat 

541 tattttataa ctaaacaaat aaagaaaaac ttcatttatt tgattccaat agctatattc 

601 tttgtctatg caatcattag tggtggtaga ttgcccctta taaggttagt tgttggagct 

661 ctgttgatat tgtatatata ctctgtgtac gggagtccta aatctcaact taccaaaagt 

721 tttaaaatga tcactcgctc tctgtttac 
1 atgcttttaa atttcttatt catatctatt tttctattaa ttatcattac atttatatta

61 tttgaggggg atttttttca acctgcagta attttaacaa tcgcttattt tatttcgatt 

121 gcaagtgctc tagttaatag aaatgtttgg ggaacagaac tccatttcaa aaccttttat 

181 ttgatattgt taggggttgc tacatttgtt atagtttcct tgttgacaaa attgtcgtac 

241 aggcctaaag tggagggaat ttcgcatgaa gaattgaaag aaataaatcc ttcaaagata 

301 atctatgtca ttcttctgac tctaaatctt gttatgttat ttctttatat ccgtgaaatt 

361 cagaaagtag tattgttttc aggtagaagt ttttctaata ttacagattt gataagtaac 

421 tataggtacc tatcttatta ttcaaatgaa gtagaaaatc gtgtaagtgg aatgattaat 

481 caactatcta aaattattcc agcgactaca cttatttctt tatatatatt tatgaataat

541 tattttataa ctaaacaaat aaagaaaaat ttcatttatt tgattccaat agctatattc 

601 tttgtctatg caatcattag tggtggtaga ttgcccctta taaggttagt tgttggatct 

661 ctgttgatat tgtatatata ctctgtgtac gggagtccta aatctcaact taccaaaagt 

721 tttaaaatga tcactcgctc tctgtttac 
1 atgcttttaa atttcttatt catatctatt tttctattaa ttatcattac atttatatta

61 tttgaggggg atttttttca acctgcagta attttaacac tcacttattt tatttcgatt

121 gcaagtgctc tagttaatag aaatgtttgg ggaacagaac tccatttcaa aacctttggt

181 ttgatattgt taggggttgc tacatttatt atagtttcct tgttgacaaa attgtcgtac

241 aggcctaaag tggagggaat ttcgtatgaa gaattgaaag aaataaatcc ttcaaagata

301 atctatgtca ttcttctgat tctaaatctt gttatgctat ttctttatac ccgtgaaatt

361 cagaaagtgg tattgttttc aggtagaagt ttttctaata ttacagattt gataagtaac

421 tataggtacc tatcttatta ttcaaatgaa gtagaaataa gtggaatgat taatcaacta

481 tctaaaatta ttccagcgac tacacttatt tctttatata tatttataaa taattatttt

541 ataactaaac aaataaagaa aaatttcatt tatttgattc caatagctat attctttgtc

601 tatgcaatca ttagtggtgg tagattgccc cttataaggt tagttgttgg agctctgttg

661 atattgtata tatactctgt gtacgggagt cctaaatctc aacttaccaa aagttttaaa

721 atgatcactc gctctctgtt tac
1 atgcttttaa atttcttatt catatctatt tttctattaa ttatcattac atttatatta

61 tttgaggggg atttttttca acctgcagta attttaacaa tcgcttattt tatttcgatt 

121 gcaagtgctc tagttaatag aaatgtttgg ggaacagaac tccatttcaa aaccttttat

181 ttgatattgt taggggttgc tacatttgtt atagtttcct tgttgacaaa attgtcgtac 

241 aggcctaaag tggagggaat ttcgcatgaa gaattgaaag aaataaatcc ttcaaagata 

301 atctatgtca ttcttctgac tctaaatctt gttatgttat ttctttatat ccgtgaaatt 

361 cagaaagtag tattgttttc aggtagaagt ttttctaata ttacagattt gataagtaac
421 tataggtacc tatcttatta ttcaaatgaa gtagaaaatc gtgtaagtgg aatgattaat 

481 caactatcta aaattattcc agcgactaca cttatttctt tatatatatt tatgaataat
541 tattttataa ctaaacaaat aaagaaaaat ttcatttatt tgattccaat agctatattc 
601 tttgtctatg caatcattag tggtggtaga ttgcccctta taaggttagt tgttggatct 
661 ctgttgatat tgtatatata ctctgtgtac gggagtccta aatctcaact taccaaaagt 
721 tttaaaatga tcactcgctc tctgtttac 
1 atgcttttaa atttcttatt catatctatt tttctattaa ttatcattac atttatatta

61 tttgagggag atttgtttca acccgcagta attttaacac ttgcttattt tatttcgatt 

121 gcaagtgctc tagttaatag aaatgtttgg ggaacagaac tccatttcaa aacctttggt 

181 ttgatattgc taggggttgc tacatttatt atagtttcct tgttgacaaa attgtcgtac 

241 aaacctaaag tggagggaat ttcgtataaa gaattaaaag aaataaatcc ttcaaagata 

301 atatatggca ttcttctgat tctaaatctt gttatgctat ttctttatat ccatgaaatt 

361 cagaaagtgg tactgttttc aggtagaggt ttttctaata ttacagattt gataagtaac 

421 tataggtacc tatcttatta ttcaaatgaa gtagaagatc gtgtaagtgg aatgattaat

481 caactagcta aaattattcc agcgactaca tttgtttctt tatatatatt tataaataat 

541 tattttataa cgaagcaaat aaagaaaaat ttcatttatt tgattccaat agctatattc 

601 tttgtctatg caatcattag tggtggtaga ctgcccctta taaggttagt tattggaact 

661 ctgttgatat tgtatatata ctctgtgtac gggagtcata aatctcaact taccaaaagt 

721 tttaaaatga tcactcgctc tctgtttac 



Figure 19 
1 atgcttttaa atttcttatt catatctatt tttctattaa ttattattac atttatatta

61 tttgaggggg atttttttca acctgcagta attttaacaa tcgcttattt tatttcgatt 

121 gcaagtgctc tagttaatag aaatgtttgg ggaacagaac tccatttcaa aaccttttat 

181 ttgatattgt taggggttgt tacatttgtt atagtttcct tgttgacaaa attgtcgtac 

241 aggcctaaag tggagggaat ttcgcatgaa gaattgaaag aaataaatcc ttcaaagata 

301 atctatgtca ttcttctgac tctaaatctt gttatgttat ttctttatat ccgtgaaatt 

361 cagaaagtag tattgttttc aggtagaagt ttttctaata ttacagattt gataagtaac 

421 tataggtacc tatcttatta ttcaaatgaa gtagaaaatc gtgtaagtgg aatgattaat 

481 caactatcta aaattattcc agcgactaca cttatttctt tatatatatt tatgaataat 

541 tattttataa ctaaacaaat aaagaaaaat ttcatttatt tgattccaat agctatattc

601 tttgtctatg caatcattag tggtggtaga ttgcccctta taaggttagt tgttggagct 

661 ctgttgatat tgtatatata ctctgtgtac gggagtccta aatctcaact taccaaaagt 

721 tttaaaatga tcactcgctc tctgtttac 
1 atgcttttaa atttcttatt catatctatt tttctattaa ttattattac atttatatta

61 tttgaggggg atttttttca acctgcagta attttaacaa tcgcttattt tatttcgatt

121 gcaagtgctc tagttaatag aaatgtttgg ggaacagaac tccatttcaa aaccttttat

181 ttgatattgt taggggttgt tacatttgtt atagtttcct tgttgacaaa attgtcgtac

241 aggcctaaag tggagggaat ttcgcatgaa gaattgaaag aaataaatcc ttcaaagata

301 atctatgtca ttcttctgac tctaaatctt gttatgttat ttctttatat ccgtgaaatt

361 cagaaagtag tattgttttc aggtagaagt ttttctaata ttacagattt gataagtaac

421 tataggtacc tatcttatta ttcaaatgaa gtagaaaatc gtgtaagtgg aatgattaat

481 caactatcta aaattattcc agcgactaca cttatttctt tatatatatt tatgaataat

541 tattttataa ctaaacaaat aaagaaaaat ttcatttatt tgattccaat agctatattc

601 tttgtctatg caatcattag tggtggtaga ttgcccctta taaggttagt tgttggagct

661 ctgttgatat tgtatatata ctctgtgtac gggagtccta aatctcaact taccaaaagt

721 tttaaaatga tcactcgctc tctgtttac
1 atgcttttaa atttcttatt catatctatt tttctattaa ttatcattac atttatatta
61 tttgagggag atttgtttca acccgcagta attttaacac ttgcttattt tatttcgatt 
121 gcaagtgctc tagttaatag aaatgtttgg ggaacagaac tccatttcaa aacctttggt 
181 ttgatattgc taggggttgc tacatttatt atagtttcct tgttgacaaa attgtcgtac 

241 aaacctaaag tggagggaat ttcgtataaa gaattaaaag aaataaatcc ttcaaagata
301 atatatggca ttcttctgat tctaaatctt gttatgctat ttctttatat ccatgaaatt
361 cagaaagtgg tactgttttc aggtagaggt ttttctaata ttacagattt gataagtaac
421 tataggtacc tatcttatta ttcaaatgaa gtagaagatc gtgtaagtgg aatgattaat
481 caactagcta aaattattcc agcgactaca tttgtttctt tatatatatt tataaataat
541 tattttataa cgaagcaaat aaagaaaaat ttcatttatt tgattccaat agctatattc
601 tttgtctatg caatcattag tggtggtaga ctgcccctta taaggttagt tattggaact
661 ctgttgatat tgtatatata ctctgtgtac gggagtcata aatctcaact taccaaaagt
721 tttaaaatga tcactcgctc tctgtttac 
1 tttgaaatgg ttgtggagtt atagattctt tttatttagg ttaaatggta ttaaagaagg

61 aaatagtgaa ttcgagaaga gttttaagat aattaggaga tactataaaa caggacgata 

121 gaatgagtaa atataaggaa ttagcaaaaa atacaggtat ttttgctttg gctaactttt 

181 catcaaagat tttaattttt ttgttagtac ctatatatac acgggtactt accactacgg 

241 aatatggttt ttatgactta gtctatacaa ctattcagct ttttgtacca atcttgacat 

301 taaatatatc tgaagccgtt atgaggttcc taatgaaaga tggtgtttct aaaaaatcag 

361 tcttttcaat tgctgtttta gatatattta ttggatcaat tgcttttgct ttattgttgt 

421 tagtaaataa cctgttttct ttatcagatt taatttctca atacagtatt tacatatttg

481 taatctttgt tttctatacc ctaaataatt ttttgataca attttctaag ggaattgata

541 aaattggtgt tacagctatc tctggggtca taagtacagc agttatgctt gccatgaatg 

601 tcattcttct agtagtattt gattg 



Figure 23 






1 tttgaaatgg ttgtggagtt atagattctt tttatttagg ttaaatggta ttaaagaagg 

61 aaatagtgaa ttcgagaaga gttttaagat aattaggaga tactataaaa caggacgata 

121 gaatgagtaa atataaggaa ttagcaaaaa atacaggtat ttttgctttg gctaactttt 

181 catcaaagat tttaattttt ttgttagtac ctatatatac acgggtactt accactacgg 

241 aatatggttt ttatgactta gtctatacaa ctattcagct ttttgtacca atcttgacat 

301 taaatatatc tgaagccgtt atgaggttcc taatgaaaga tggtgtttct aaaaaatcag 

361 tcttttcaat tgctgtttta gatatattta ttggatcaat tgcttttgct ttattgttgt 

421 tagtaaataa cctgttttct ttatcagatt taatttctca atacagtatt tacatatttg 

481 taatctttgt tttctatacc ctaaataatt ttttgataca attttctaag ggaattgata 

541 aaattggtgt tacagctatc tctggggtca taagtacagc agttatgctt gccatgaatg 

601 tcattcttct agtagtattt gattg 



Figure 24 
1 tttgaaatgg ttgtggagtt atagattctt tttatttagg ttaaatggta ttaaagaagg

61 aaatagtgaa ttcgagaaga gttttangat aattaggaga tactataaaa caggacgata 

121 gaatgagtaa atataaggaa ttagcaaaaa atacaggtat ttttgctttg gctaactttt 

181 catcaaagat tttaattttt ttgttagtac ctatatatac acgggtactt accactacgg 

241 aatatggttt ttatgactta gtctatacaa ctattcagct ttttgtacca atcttgacat 

301 taaatatatc tgaagccgtt atgaggttcc taatgaaaga tggtgtttct aaaaaatcag

361 tcttttcaat tgctgtttta gatatattta ttggatcaat tgcttttgct ttattgttgt 

421 tagtaaataa cctgttttct ttatcagatt taatttctca atacagtatt tacatatttg 

481 taatctttgt tttctatacc ctaaataatt ttttgataca attttctaag ggaattgata 

541 aaattggtgt tacagctatc tctgggatca taagtacagc agntatgctt gccatgaatg

601 tcattcttct agtagtattt gattg 
1 tttgaaatgg ttgtggagtt atagattctt tttatttagg ttaaatggta ttaaagaagg

61 aaatagtgaa ttcgagaaga gttttaagat aattaggaga tactataaaa caggacgata 

121 gaatgagtaa atataaggaa ttagcaaaaa atacaggtat ttttgctttg gctaactttt 

181 catcaaagat tttaattttt ttgttagtac ctatatatac acgggtactt accactacgg 

241 aatatggttt ttatgactta gtctatacaa ctattcagct ttttgtacca atcttgacat 

301 taaatatatc tgaagccgtt atgaggttcc taatgaaaga tggtgtttct aaaaaatcag 

361 tcttttcaat tgctgtttta gatatattta ttggatcaat tgcttttgct ttattgttgt 

421 tagtaaataa cctgttttct ttatcagatt taatttctca atacagtatt tacatatttg 

481 taatctttgt tttctatacc ctaaataatt ttttgataca attttctaag ggaattgata 

541 aaattggtgt tacagctatc tctggggtca taagtacagc agttatgctt gccatgaatg 

601 tcattcttct agtagtattt gattg 



Figure 26 
1 tttgaaatgg ttgtggagtt atagattctt tttatttagg ttaaatggta ttaaagaagg

61 aaatagtgaa ttcgagaaga gttttaagat aattaggaga tactataaaa caggacgata 

121 gaatgagtaa atataaggaa ttagcaaaaa atacaggtat ttttgctttg gctaactttt 

181 catcaaagat tttaattttt ttgttagtac ctatatatac acgggtactt accactacgg 

241 aatatggttt ttatgactta gtctatacaa ctattcagct ttttgtacca atcttgacat 

301 taaatatatc tgaagccgtt atgaggttcc taatgaaaga tggtgtttct aaaaaatcag 

361 tcttttcaat tgctgtttta gatatattta ttggatcaat tgcttttgct ttattgttgt 

421 tagtaaataa cctgttttct ttatcagatt taatttctca atacagtatt tacatatttg 

481 taatctttgt tttctatacc ctaaataatt ttttgataca attttctaag ggaattgata 

541 aaattggtgt tacagctatc tctggggtca taagtacagc agttatgctt gccatgaatg 

601 tcattcttct agtagtattt gattg 



Figure 27 






1 tttgaaatgg ttgtggagtt atagattctt tttatttagg ttaaatggta ttaaagaagg

61 aaatagtgaa ttcgagaaga gttttangat aattaggaga tactataaaa caggacgata

121 gaatgagtaa atataaggaa ttagcaaaaa atacaggtat ttttgctttg gctaactttt

181 catcaaagat tttaattttt ttgttagtac ctatatatac acgggtactt accactacgg

241 aatatggttt ttatgactta gtctatacaa ctattcagct ttttgtacca atcttgacat

301 taaatatatc tgaagccgtt atgaggttcc taatgaaaga tggtgtttct aaaaaatcag

361 tcttttcaat tgctgtttta gatatattta ttggatcaat tgcttttgct ttattgttgt

421 tagtaaataa cctgttttct ttatcagatt taatttctca atacagtatt tacatatttg

481 taatctttgt tttctatacc ctaaataatt ttttgataca attttctaag ggaattgata

541 aaattggtgt tacagctatc tctggggtca taagtacagc agttatgctt gccatgaatg

601 tcattcttct agtagtattt gattg
1 tttgaaatgg ttgtggagtt atagattctt tttatttagg ttaaatggta ttaaagaagg

61 aaatagtgaa ttcgagaaga gttttaagat aattaggaga tactataaaa caggacgata 

121 gaatgagtaa atataaggaa ttagcaaaaa atacaggtat ttttgctttg gctaactttt 

181 catcaaagat tttaattttt ttgttagtac ctatatatac acgggtactt accactacgg 

241 aatatggttt ttatgactta gtctatacaa ctattcagct ttttgtacca atcttgacat 

301 taaatatatc tgaagccgtt atgaggttcc taatgaaaga tggtgtttct aaaaaatcag 

361 tcttttcaat tgctgtttta gatatattta ttggatcaat tgcttttgct ttattgttgt 

421 tagtaaataa cctgttttct ttatcagatt taatttctca atacagtatt tacatatttg 

481 taatctttgt tttctatacc ctaaataatt ttttgataca attttctaag ggaattgata 

541 aaattggtgt tacagctatc tctggggtca taagtacagc agttatgctt gccatgaatg 

601 tcattcttct agnagtattt gattg 



Figure 29 
1 tgcgctatta ataattttcc ttatgataac tactatattt atagagacct atatgttttt

61 atttgtcatt tctttatact attctcttga ttttggggac gatagagatt gtcatgagaa 

121 acagtacatt actaattaat aataaaggtg tgaacagaaa taagaagaaa tgaaaatact 

181 aaaaaactat gcctacaatc tttcttatca attgttggtg atcatactcc ctatcattac 

241 gactccctat gtaacgaggg tttttagttc tgacgattta ggaacgtatg gctactttag 

301 ctccattgtt acctatttta ccttgcttgc aactcttggt gttgccaact acggtaccaa 

361 agagatttca gcacatcgta aggaaattgg gaagaatttc tggggaattt attctctcca 

421 gtttggtgca acttggctat ccattttgct ttatcttgcc ctttgtttct tatttacttc 

481 aatgcaaaat ccggtagctt atatattggg attaagttta gtgtcaaaag gtttggatat 

541 ttcttggtta tttcaaggtt tggaggattt tagaaagatt acagttcgga acatcactgt 

601 taagtta 



Figure 30 
1 tgcgctatta ataattttcc ttatgataac tactatattt atagagacct atatgttttt

61 atttgtcatt tctttatact attctcttga ttttggggac gatagagatt gtcatgagaa 

121 acagtacatt actaattaat aataaaggtg tgaacagaaa taagaagaaa tgaaaatact 

181 aaaaaattat gcctacaatc tttcttatca attgttggtg atcatactcc ctatcattac 

241 gactccctat gtaacgaggg tttttagttc tgacgattta ggaacgtatg gctactttag 

301 ctccattgtt acctatttta ccttgcttgc aactcttggt gttgccaact acggtaccaa 

361 agagatttca gcacatcgta aggaaattgg gaagaatttc tggggaattt attctctcca
421 gtttggtgca acttggctat ccattttgct ttatcttgcc ctttgtttct tatttacttc 

481 aatgcaaaat ccggtagctt atatattggg attaagttta gtgtcaaaag gtttggatat
541 ttcttggtta tttcaaggtt tggaggattt tagaaagatt acagttcgga acatcactgt 
601 taagtta 



Figure 31 






1 tgcgctatta ataattttcc ttatgataac tactatattt atagagacct atatgttttt 

61 atttgtcatt tctttatact attctcttga ttttggggac gatagagatt gtcatgagaa 

121 acagtacatt actaattaat aataaaggtg tgaacagaaa taagaagaaa tgaaaatact 

181 aaaaaattat gcctacaatc tttcttatca attgttggtg atcatactcc ctatcattac 

241 gactccctat gtaacgaggg tttttagttc tgacgattta ggaacgtatg gctactttag 

301 ctccattgtt acctatttta ccttgcttgc aactcttggt gttgccaact acggtaccaa 

361 agagatttca gcacatcgta aggaaattgg gaagaatttc tggggaattt attctctcca 

421 gtttggtgca acttggctat ccattttgct ttatcttgcc ctttgtttct tatttacttc 

481 aatgcaaaat ccggtagctt atatattggg attaagttta gtgtcaaaag gtttggatat 

541 ttcttggtta tttcaaggtt tggaggattt tagaaagatt acagttcgga acatcactgt 

601 taagtta 



Figure 32 
1 tgcgctatta ataattttcc ttatgataac tactatattt atagagacct atatgttttt

61 atttgtcatt tctttatact attctcttga ttttggggac gatagagatt gtcatgagaa

121 acagtacatt actaattaat aataaaggtg tgaacagaaa taagaagaaa tgaaaatact

181 aaaaaactat gcctacaatc tttcttatca attgttggtg atcatactcc ctatcattac

241 gactccctat gtaacgaggg tttttagttc tgacgattta ggaacgtatg gctactttag

301 ctccattgtt acctatttta ccttgcttgc aactcttggt gttgccaact acggtaccaa

361 agagatttca gcacatcgta aggaaattgg gaagaatttc tggggaattt attctctcca

421 gtttggtgca acttggctat ccattttgct ttatcttgcc ctttgtttct tatttacttc

481 aatgcaaaat ccggtagctt atatattggg attaagttta gtgtcaaaag gtttggatat

541 ttcttggtta tttcaaggtt tggaggattt tagaaagatt acagttcgga acatcactgt

601 taagtta
1 tgcgctatta ataattttcc ttatgataac tactatattt atagagacct atatgttttt

61 atttgtcatt tctttatact attctcttga ttttggggac gatagagatt gtcatgagaa

121 acagtacatt actaattaat aataaaggtg tgaacagaaa taagaagaaa tgaaaatact

181 aaaaaattat gcctacaatc tttcttatca attgttggtg atcatactcc ctatcattac

241 gactccctat gtaacgaggg tttttagttc tgacgattta ggaacgtatg gctactttag

301 ctccattgtt acctatttta ccttgcttgc aactcttggt gttgccaact acggtaccaa

361 agagatttca gcacatcgta aggaaattgg gaagaatttc tggggaattt attctctcca

421 gtttggtgca acttggctat ccattttgct ttatcttgcc ctttgtttct tatttacttc

481 aatgcaaaat ccggtagctt atatattggg attaagttta gtgtcaaaag gtttggatat

541 ttcttggtta tttcaaggtt tggaggattt tagaaagatt acagttcgga acatcactgt

601 taagtta
1 tgcgctatta ataattttcc ttatgataac tactatattt atagagacct atatgttttt

61 atttgtcatt tctttatact attctcttga ttttggggac gatagagatt gtcatgagaa 

121 acagtacatt actaattaat aataaaggtg tgaacagaaa taagaagaaa tgaaaatact 

181 aaaaaattat gcctacaatc tttcttatca attgttggtg atcatactcc ctatcattac 

241 gactccctat gtaacgaggg tttttagttc tgacgattta ggaacgtatg gctactttag 

301 ctccattgtt acctatttta ccttgcttgc aactcttggt gttgccaact acggtaccaa 

361 agagatttca gcacatcgta aggaaattgg gaagaatttc tggggaattt attctctcca 

421 gtttggtgca acttggctat ccattttgct ttatcttgcc ctttgtttct tatttacttc 

481 aatgcaaaat ccggtagctt atatattggg attaagttta gtgtcaaaag gtttggatat

541 ttcttggtta tttcaaggtt tggaggattt tagaaagatt acagttcgga acatcactgt 

601 taagtta 
1 tgcgctatta ataattttcc ttatgataac tactatattt atagagacct atatgttttt

61 atttgtcatt tctttatact attctcttga ttttggggac gatagagatt gtcatgagaa 

121 acagtacatt actaattaat aataaaggtg tgaacagaaa taagaagaaa tgaaaatact 

181 aaaaaactat gcctacaatc tttcttatca attgttggtg atcatactcc ctatcattac 

241 gactccctat gtaacgaggg tttttagttc tgacgattta ggaacgtatg gctactttag 

301 ctccattgtt acctatttta ccttgcttgc aactcttggt gttgccaact acggtaccaa 

361 agagatttca gcacatcgta aggaaattgg gaagaatttc tggggaattt attctctcca 

421 gtttggtgca acttggctat ccattttgct ttatcttgcc ctttgtttct tatttacttc 

481 aatgcaaaat ccggtagctt atatattggg attaagttta gtgtcaaaag gtttggatat 

541 ttcttggtta tttcaaggtt tggaggattt tagaaagatt acagttcgga acatcactgt 

601 taagtta 



Figure 36 
1 tgcgctatta ataattttcc ttatgataac tactatattt atagagacct atatgttttt

61 atttgtcatt tctttatact attctcttga ttttggggac gatagagatt gtcatgagaa

121 acagtacatt actaattaat aataaaggtg tgaacagaaa taagaagaaa tgaaaatact

181 aaaaaactat gcctacaatc tttcttatca attgttggtg atcatactcc ctatcattac

241 gactccctat gtaacgaggg tttttagttc tgacgattta ggaacgtatg gctactttag

301 ctccattgtt acctatttta ccttgcttgc aactcttggt gttgccaact acggtaccaa

361 agagatttca gcacatcgta aggaaattgg gaagaatttc tggggaattt attctctcca

421 gtttggtgca acttggctat ccattttgct ttatcttgcc ctttgtttct tatttacttc

481 aatgcaaaat ccggtagctt atatattggg attaagttta gtgtcaaaag ggttggatat

541 ttcttggtta tttcaaggtt tggaggattt tagaaagatt acagttcgga acatcactgt

601 taagtta



Figure 37 






1 tgatcatact ccctatcatt acgactccct atgtaacgag ggtcttttct tcggatgatt 

61 tagggacgta tggttatttt aattccatcg ttacttattt tatcctctta gcgacgctag 

121 gagttgctaa ctatgggacc aaggtcattt cagggcatcg aaagcaaatt caaaaaaact 

181 ttttgggaat ctattctctg caattaggtg caacagttct ttctctgtcc ttgtatgctc 

241 ttctttgtct aactcttccc tttatgcaaa atccggtagc ctatattcta ggcttgagtt 

301 tagtttctaa aggtttagac atctcctggc tctttcaagg gttagaagat tttcgtaaaa 

361 ttacggtcag aaatatcaca gtgaagctt



Figure 38 






1 gagagtttgt acagtcactt actgaatcag tagaggggag aatcttgcct aatttaaaga

61 aaaacattgt ttacaatgtc ttatatcaga tcttagctgt aatagtaccg tttattacct

121 caccttactt agcgcgtgtg ttaggtgcag agcaaattgg agtttattct tttacttatt

181 ccattgcttt ttactttatg attctgtcca tgttgggaat ttctaattat gggaatcgga

241 caatggcaca ggtacgaaca agtagagaac atttgaatca agaattttcg aatatttacg

301 cggttcagtt gacgtgttca ctagtaatga ccgtctcata tttgatttat gcaacagtat

361 ttgtgaatag ttttcagatt gtagcctata tccaagtatt acatgtttta tcgtatgcaa

421 cagatgttag ttggtttttt tatggtcttg aagagtttcg tattacggtt gctaggaatt

481 catttgttaa gttattaact ttaatatcta tctttacatt tgtaaaaagc cctaatgata

541 tctatttata tacctttata atggcaggga gtaccctgct tggtcagttg attacatggc

601 cattttt



Figure 39 
1 gagagtttgt acagttattt actgaatcag tagaggggag aatcttgcct agtttaaaga

61 aaaatattgt ttacaatgtc ttatatcaga tcttagctgt aatagtacca tttattacct

121 caccttactt agcgcgtgtg ttaggtgcag agcaaattgg agtttattct tttacttatt

181 ccattgcttt ttactttatg attctgtcca tgttggggat ttctaattat gggaatcgga

241 caatagcacg ggtacgaaca agtagagaac atttgaatca ggaattttcg aatatttacg

301 cggttcagtt gacgtgttca ctagtaatga ccatctcata tttgatttat gcaacagtat

361 ttgtgaatag ttttcagatt gtagcctata tccaagtatt acatgtttta tcgtatgcaa

421 cagatgttag ttggtttttt tatggtcttg aagagtttcg tattacggtt gctaggaatt

481 catttgttaa gttattaact ttaatatcta tctttacatt tgtaaaaagc cctaatgata

541 tctatttata tacctttata atggcagggg gtaccctgct tggtcagttg attacatggc

601 cattttt
1 gagagtttgt acagttattt actgaatcag tagaggggag aatcttgcct agtttaaaga

61 aaaatattgt ttacaatgtc ttatatcaga tcttagctgt aatagtacca tttattacct 

121 caccttactt agcgcgtgtg ttaggtgcag agcaaattgg agtttattct tttacttatt 

181 ccattgcttt ttactttatg attctgtcca tgttggggat ttctaattat gggaatcgga 

241 caatagcacg ggtacgaaca agtagagaac atttgaatca ggaattttcg aatatttacg 

301 cggttcagtt gacgtgttca ctagtaatga ccatctcata tttgatttat gcaacagtat 

361 ttgtgaatag ttttcagatt gtagcctata tccaagtatt acatgtttta tcgtatgcaa 

421 cagatgttag ttggtttttt tatggtcttg aagagtttcg tattacggtt gctaggaatt 

481 catttgttaa gttattaact ttaatatcta tctttacatt tgtaaaaagc cctaatgata 

541 tctatttata tacctttata atggcagggg gtaccctgct tggtcagttg attacatggc 

601 cattttt 



Figure 41 



1 gagagtttgt acagtcactt actgaatcag tagaggggag aatcttgcct aatttaaaga 

61 aaaacattgt ttacaatgtc ttatatcaga tcttagctgt aatagtaccg tttattacct 

121 caccttactt agcgcgtgtg ttaggtgcag agcaaattgg agtttattct tttacttatt 

181 ccattgcttt ttactttatg attctgtcca tgttgggaat ttctaattat gggaatcgga 

241 caatagcaca ggtacgaaca agtagagaac atttgaatca agaattttcg aatatttacg 

301 cagttcagtt gacgtgttca ctagtaatga ccgtctcata tttgatttat gcaacagtat 

361 ttgtgaatag ttttcagatt gtagcctata tccaagtatt acatgtttta tcgtatgcaa 

421 cagatgttag ttggtttttt tatggtcttg aagagtttcg tattacggtt gctaggaatt 

481 catttgttaa gttattaact ttaatatcta tctttacatt tgtaaaaagc cctaatgata 

541 tctatttata tacctttata atggcaggga gtaccctgct tggtcagttg attacatggc 

601 aattttt 



1 gagagtttgt acagtcactt actgaatcag tagaggggag aatcttgcct aatttaaaga 

61 aaaacattgt ttacaatgtc ttatatcaga tcttagctgt aatagtaccg tttattacct 

121 caccttactt agcgcgtgtg ttaggtgcag agcaaattgg agtttattct tttacttatt 

181 ccattgcttt ttactttatg attctgtcca tgttgggaat ttctaattat gggaatcgga 

241 caatagcaca ggtacgaaca agtagagaac atttgaatca agaattttcg aatatttacg 

301 cagttcagtt gacgtgttca ctagtaatga ccgtctcata tttgatttat gcaacagtat 

361 ttgtgaatag ttttcagatt gtagcctata tccaagtatt acatgtttta tcgtatgcaa 

421 cagatgttag ttggtttttt tatggtcttg aagagtttcg tattacggtt gctaggaatt 

481 catttgttaa gttattaact ttaatatcta tctttacatt tgtaaaaagc cctaatgata 

541 tctatttata tacctttata atggcaggga gtaccctgct tggtcagttg attacatggc 

601 aattttt 



1 gagagtttgt acagtcactt actgaatcag tagaggggag aatcttgcct aatttaaaga 

61 aaaacattgt ttacaatgtc ttatatcaga tcttagctgt aatagtaccg tttattacct 

121 caccttactt agcgcgtgtg ttaggtgcag agcaaattgg agtttattct tttacttatt 

181 ccattgcttt ttactttatg attctgtcca tgttgggaat ttctaattat gggaatcgga 

241 caatagcaca ggtacgaaca agtagagaac atttgaatca agaattttcg aatatttacg 

301 cagttcagtt gacgtgttca ctagtaatga ccgtctcata tttgatttat gcaacagtat 

361 ttgtgaatag ttttcagatt gtagcctata tccaagtatt acatgtttta tcgtatgcaa 

421 cagatgttag ttggtttttt tatggtcttg aagagtttcg tattacggtt gctaggaatt 

481 catttgttaa gttattaact ttaatatcta tctttacatt tgtaaaaagc cctaatgata 

541 tctatttata tacctttata atggcaggga gtaccctgct tggtcagttg attacatggc 

601 aattttt 



Figure 44 



1 gagagtttgt acagtcactt actgaatcag tagaggggag aatcttgcct aatttaaaga 

61 aaaacattgt ttacaatgtc ttatatcaga tcttagctgt aatagtaccg tttattacct 

121 caccttactt agcgcgtgtg ttaggtgcag agcaaattgg agtttattct tttacttatt 

181 ccattgcttt ttactttatg attctgtcca tgttgggaat ttctaattat gggaatcgga 

241 caatagcaca ggtacgaaca agtagagaac atttgaatca agaattttcg aatatttacg 

301 cagttcagtt gacgtgttca ctagtaatga ccgtctcata tttgatttat gcaacagtat 

361 ttgtgaatag ttttcagatt gtagcctata tccaagtatt acatgtttta tcgtatgcaa 

421 cagatgttag ttggtttttt tatggtcttg aagagtttcg tattacggtt gctaggaatt 

481 catttgttaa gttattaact ttaatatcta tctttacatt tgtaaaaagc cctaatgata 

541 tctatttata tacctttata atggcaggga gtaccctgct tggtcagttg attacatggc 

601 aattttt 



Figure 45 






1 gagagtttgt acagtcactt actgaatcag tagaggggag aatcttgcct aatttaaaga 

61 aaaacattgt ttacaatgtc ttatatcaga tcttagctgt aatagtaccg tttattacct 

121 caccttactt agcgcgtgtg ttaggtgcag agcaaattgg agtttattct tttacttatt 

181 ccattgcttt ttactttatg attctgtcca tgttgggaat ttctaattat gggaatcgga 

241 caatagcaca ggtacgaaca agtagagaac atttgaatca agaattttcg aatatttacg 

301 cagttcagtt gacgtgttca ctagtaatga ccgtctcata tttgatttat gcaacagtat 

361 ttgtgaatag ttttcagatt gtagcctata tccaagtatt acatgtttta tcgtatgcaa 

421 cagatgttag ttggtttttt tatggtcttg aagagtttcg tattacggtt gctaggaatt 

481 catttgttaa gttattaact ttaatatcta tctttacatt tgtaaaaagc cctaatgata 

541 tctatttata tacctttata atggcaggga gtaccctgct tggtcagttg attacatggc 

601 aattttt 



Figure 46 



1 gagagtttgt acagtcactt actgaatcag tagaggggag aatcttgcct aatttaaaga 

61 aaaacattgt ttacaatgtc ttatatcaga tcttagctgt aatagtaccg tttattacct 

121 caccttactt agcgcgtgtg ttaggtgcag agcaaattgg agtttattct tttacttatt 

181 ccattgcttt ttactttatg attctgtcca tgttgggaat ttctaattat gggaatcgga 

241 caatagcaca ggtacgaaca agtagagaac atttgaatca agaattttcg aatatttacg 

301 cagttcagtt gacgtgttca ctagtaatga ccgtctcata tttgatttat gcaacagtat 

361 ttgtgaatag ttttcagatt gtagcctata tccaagtatt acatgtttta tcgtatgcaa 

421 cagatgttag ttggtttttt tatggtcttg aagagtttcg tattacggtt gctaggaatt 

481 catttgttaa gttattaact ttaatatcta tctttacatt tgtaaaaagc cctaatgata 

541 tctatttata tacctttata atggcaggga gtaccctgct tggtcagttg attacatggc 

601 aattttt 



Figure 47 



1 caggatatag atgcatgtta gattagatgg tttgctggac tatatatttc tatttagtgt 

61 gattattact tgtaatacta tgtattcaac tagtcaagga tttgatggac tagggaaatg 

121 ggcgactctg ttacttgtgg tatcagtttt tctgaaattg cttatctcta gaatatctat 

181 gaaggcaatc aatgtgattg tgtcgcgttc tttaatattt atattaatta ttctactcat 

241 agtaatatta aatggtttta agatttctga gacaagtttc gtctattatt ttgtattatt 

301 tccgattttt atgatgattt tgcagatgta ctatgatgtt aatgaaatcg caaatctgat 

361 acggaaattt gttcgtataa tatttctttt agcaattggc tctctcctat tttggcttat 

421 tggtagtgta tttcatatta tatccccaac ggtttatgtg ttgaattatt ggaatggtgg 

481 gggaatagta gaagggtact ataatcttca ttttgaagca caaaaaatag agattttggg 

541 ggcgata 



Figure 48 



1 caggatatag atgcatgtta gattagatgg tttgctggac tatatatttc tatttagtgt 

61 gattattact tgtaatacta tgtattcaac tagtcaagga tttgatggac tagggaaatg 

121 ggcgactctg ttacttgtgg tatcagttat tctgaaattg cttatctcta gaatatctat 

181 gaaggcaatc aatgtgattg tgtcgcgttc tttaatattt atattaatta ttctactcat 

241 agtaatatta aatggtttta agatttctga gacaagtttc gtctattatt ttgtattatt 

301 tccgattttt atgatgattt tgcagatgta ctatgatgtt aatgaaatcg caaatctgat 

361 acggaaattt gttcgtataa tatttctttt agcaattggc tctctcctat tttggcttat 

421 tggtagtgta tttcatatta tatccccaac ggtttatgtg ttgaattatt ggaatggtgg 

481 gggaatagta gaagggtact ataatcttca ttttgaagca caaaaaatag agattttggg 

541 ggcgata 



Figure 49 



1 caggatatag atgcatgtta gattagatgg tttgctggac tatatatttc tatttagtgt 

61 gattattact tgtaatacta tgtattcaac tagtcaagga tttgatggac tagggaaatg 

121 ggcgactctg ttacttgtgg tatcagttat tctgaaattg cttatctcta gaatatctat 

181 gaaggcaatc aatgtgattg tgtcgcgttc tttaatattt atattaatta ttctactcat 

241 agtaatatta aatggtttta agatttctga gacaagtttc gtctattatt ttgtattatt 

301 tccgattttt atgatgattt tgcagatgta ctatgatgtt aatgaaatcg caaatctgat 

361 acggaaattt gttcgtataa tatttctttt agcaattggc tctctcctat tttggcttat 

421 tggtagtgta tttcatatta tatccccaac ggtttatgtg ttgaattatt ggaatggtgg 

481 gggaatagta gaagggtact ataatcttca ttttgaagca caaaaaatag agattttggg 

541 ggcgata 



Figure 50 



1 caggatatag atgcatgtta gattagatgg tttgctggac tatatatttc tatttagtgt 

61 gattattact tgtaatacta tgtattcaac tagtcaagga tttgatggac tagggaaatg 

121 ggcgactctg ttacttgtgg tatcagttat tctgaaattg cttatctcta gaatatctat 

181 gaaggcaatc aatgtgattg tgtcgcgttc tttaatattt atattaatta ttctactcat 

241 agtaatatta aatggtttta agatttctga gacaagtttc gtctattatt ttgtattatt 

301 tccgattttt atgatgattt tgcagatgta ctatgatgtt aatgaaatcg caaatctgat 

361 acggaaattt gttcgtataa tatttctttt agcaattggc tctctcctat tttggcttat 

421 tggtagtgta tttcatatta tatccccaac ggtttatgtg ttgaattatt ggaatggtgg 

481 gggaatagta gaagggtact ataatcttca ttttgaagca caaaaaatag agattttggg 

541 ggcgata 



Figure 51 



1 caggatatag atgcatgtta gattagatgg tttgctggac tatatatttc tatttagtgt 

61 gattattact tgtaatacta tgtattcaac tagtcaagga tttgatggac tagggaaatg 

121 ggcgactctg ttacttgtgg tatcagtttt tctgaaattg cttatctcta gaatatctat 

181 gaaggcaatc aatgtgattg tgtcgcgttc tttaatattt atattaatta ttctactcat 

241 agtaatatta aatggtttta agatttctga gacaagtttc gtctattatt ttgtattatt 

301 tccgattttt atgatgattt tgcagatgta ctatgatgtt aatgaaatcg caaatctgat 

361 acggaaattt gttcgtataa tatttctttt agcaattggc tctctcctat tttggcttat 

421 tggtagtgta tttcatatta tatccccaac ggtttatgtg ttgaattatt ggaatggtgg 

481 gggaatagta gaagggtact ataatcttca ttttgaagca caaaaaatag agattttggg 

541 ggcgata 



Figure 52 



1 caggatatag atgcatgtta gattagatgg tttgctggac tatatatttc tatttagtgt 

61 gattattact tgtaatacta tgtattcaac tagtcaagga tttgatggac tagggaaatg 

121 ggcgactctg ttacttgtgg tatcagttat tctgaaattg cttatctcta gaatatctat 

181 gaaggcaatc aatgtgattg tgtcgcgttc tttaatattt atattaatta ttctactcat 

241 agtaatatta aatggtttta agatttctga gacaagtttc gtctattatt ttgtattatt 

301 tccgattttt atgatgattt tgcagatgta ctatgatgtt aatgaaatcg caaatctgat 

361 acggaaattt gttcgtataa tatttctttt agcaattggc tctctcctat tttggcttat 

421 tggtagtgta tttcatatta tatccccaac ggtttatgtg ttgaattatt ggaatggtgg 

481 gggaatagta gaagggtact ataatcttca ttttgaagca caaaaaatag agattttggg 

541 ggcgata 



Figure 53 






1 caggatatag atgcatgtta gattagatgg tttgctggac tatatatttc tatttagtgt 

61 gattattact tgtaatacta tgtattcaac tagtcaagga tttgatggac tagggaaatg 

121 ggcgactctg ttacttgtgg tatcagtttt tctgaaattg cttatctcta gaatatctat 

181 gaaggcaatc aatgtgattg tgtcgcgttc tttaatattt atattaatta ttctactcat 

241 agtaatatta aatggtttta agatttctga gacaagtttc gtctattatt ttgtattatt 

301 tccgattttt atgatgattt tgcagatgta ctatgatgtt aatgaaatcg caaatctgat 

361 acggaaattt gttcgtataa tatttctttt agcaattggc tctctcctat tttggcttat 

421 tggtagtgta tttcatatta tatccccaac ggtttatgtg ttgaattatt ggaatggtgg 

481 gggaatagta gaagggtact ataatcttca ttttgaagca caaaaaatag agattttggg 

541 ggcgata 



Figure 54 



1 caggatatag atgcatgtta gattagatgg tttgctggac tatatatttc tatttagtgt 

61 gattattact tgtaatacta tgtattcaac tagtcaagga tttgatggac tagggaaatg 

121 ggcgactctg ttacttgtgg tatcagtttt tctgaaattg cttatctcta gaatatctat 

181 gaaggcaatc aatgtgattg tgtcgcgttc tttaatattt atattaatta ttctactcat 

241 agtaatatta aatggtttta agatttctga gacaagtttc gtctattatt ttgtattatt 

301 tccgattttt atgatgattt tgcagatgta ctatgatgtt aatgaaatcg caaatctgat 

361 acggaaattt gttcgtataa tatttctttt agcaattggc tctctcctat tttggcttat 

421 tggtagtgta tttcatatta tatccccaac ggtttatgtg ttgaattatt ggaatggtgg 

481 gggaatagta gaagggtact ataatcttca ttttgaagca caaaaaatag agattttggg 

541 ggcgata 



Figure 55 



1 tttttagaac gtactcattt atttaaaagg aagtaatagt gaaatttaaa tttaaattta 

61 atccaatcgc gatactgtat atattgctag tatacttaga gttagctaca gataggcaac 

121 atctgtatcc tgtaacgtac atgacaaaat attatattgg tattttaatc actgtgttgt 

181 ttgttttgtt attagtaggc cgtgggaagc ttatttttgt taataaaaaa ttattatatc 

241 ttgctaagat attagctata ccaacaattg ttcttttcct gtactcagtc ttactagacg 

301 taatgaaccc agttgaattt gatggatatt ttagtaggtt atcaagtacg actatttttg 

361 gtttgttagc tatctttcaa gctatagttg tttttcaatt ttttggacaa aaagtagtag 

421 attacacttt tacagctatc tccctcagct acttaaccag tatcattgtt gcctttaggc 

481 agggaggact tagtcaattt atcttgatgc taacagatga tagtttcaat ggttcggtac 

541 tagaaat 



Figure 56 






1 tttttagaac gtactcattt atttaaaagg aagtaatagt gaaatttaaa tttaaattta 

61 atccaatcgc gatactgtat atattgctag tatacttaga gttagctaca gataggcaac 

121 atctgtatcc tgtaacgtac atgacaaaat attatattgg tattttaatc actgtgttgt 

181 ttgttttgtt attagtaggc cgtgggaagc ttatttttgt taataaaaaa ttattatatc 

241 ttgctaagat attagctata ccaacaattg ttcttttcct gtactcagtc ttactagacg 

301 taatgaaccc agttgaattt gatggatatt ttagtaggtt atcaagtacg actatttttg 

361 gtttgttagc tatctttcaa gctatagttg tttttcaatt ttttggacaa aaagtagtag 

421 attacacttt tacagctatc tccctcagct acttaaccag tatcattgtt gcctttaggc 

481 agggaggact tagtcaattt atcttgatgc taacagatga tagtttcaat ggttcggtac 

541 tagaaat 



Figure 57 



1 tttttagaac atactcattt atttaaaagg aaataatagt gaaatttaaa tttaatccaa 

61 tcgcgatact gtatatattg ctagtatact tagagttggc tacagatagg caacatctgt 

121 atcctgtaac gtacatgaca aaatattata ttggtatttt aatcattgtg ttgtttgttt 

181 tattattagt aggccgtggg aagcttattt ttgttaataa aaaattatta tatcttgcta 

241 agatattagc tataccaaca attgttcttt tcctgtactc agtcttacta gacgtaatga 

301 acccagttga atttaatgga tattttagta gattatcaag tacgactatt tttggtttgt 

361 tagctatctt tcaagctata gttgtttttc aattttttgg acaaaaagta gtagattaca 

421 cttttacagc tatctccctc agctacttaa ccagtatcat tgttgccttt aggcagggag 

481 gacttagtca atttatcttg atactaacag atgatagttt caatggttcg gtactagaaa 

541 t 



Figure 58 



1 tttttagaac atactcattt atttaaaagg aaataatagt gaaatttaaa tttaatccaa 

61 tcgcgatact gtatatattg ctagtatact tagagttggc tacagatagg caacatctgt 

121 atcctgtaac gtacatgaca aaatattata ttggtatttt aatcattgtg ttgtttgttt 

181 tattattagt aggccgtggg aagcttattt ttgttaataa aaaattatta tatcttgcta 

241 agatattagc tataccaaca attgttcttt tcctgtactc agtcttacta gacgtaatga 

301 acccagttga atttaatgga tattttagta gattatcaag tacgactatt tttggtttgt 

361 tagctatctt tcaagctata gttgtttttc aattttttgg acaaaaagta gtagattaca 

421 cttttacagc tatctccctc agctacttaa ccagtatcat tgttgccttt aggcagggag 

481 gacttagtca atttatcttg atactaacag atgatagttt caatggttcg gtactagaaa 

541 t 



Figure 59 



1 tttttagaac atactcattt atttaaaagg aaataatagt gaaatttaaa tttaatccaa 

61 tcgcgatact gtatatattg ctagtatact tagagttggc tacagatagg caacatctgt 

121 atcctgtaac gtacatgaca aaatattata ttggtatttt aatcattgtg ttgtttgttt 

181 tattattagt aggccgtggg aagcttattt ttgttaataa aaaattatta tatcttgcta 

241 agatattagc tataccaaca attgttcttt tcctgtactc agtcttacta gacgtaatga 

301 acccagttga atttaatgga tattttagta gattatcaag tacgactatt tttggtttgt 

361 tagctatctt tcaagctata gttgtttttc aattttttgg acaaaaagta gtagattaca 

421 cttttacagc tatctccctc agctacttaa ccagtatcat tgttgccttt aggcagggag 

481 gacttagtca atttatcttg atactaacag atgatagttt caatggttcg gtactagaaa 

541 t 



Figure 60 



1 tttttagaac atactcattt atttaaaagg aaataatagt gaaatttaaa tttaatccaa 

61 tcgcgatact gtatatattg ctagtatact tagagttggc tacagatagg caacatctgt 

121 atcctgtaac gtacatgaca aaatattata ttggtatttt aatcattgtg ttgtttgttt 

181 tattattagt aggccgtggg aagcttattt ttgttaataa aaaattatta tatcttgcta 

241 agatattagc tataccaaca attgttcttt tcctgtactc agtcttacta gacgtaatga 

301 acccagttga atttaatgga tattttagta gattatcaag tacgactatt tttggtttgt 

361 tagctatctt tcaagctata gttgtttttc aattttttgg acaaaaagta gtagattaca 

421 cttttacagc tatctccctc agctacttaa ccagtatcat tgttgccttt aggcagggag 

481 gacttagtca atttatcttg atactaacag atgatagttt caatggttcg gtactagaaa 

541 t 



Figure 61 



1 tttttagaac atactcattt atttaaaagg aagtaatagt gaaatttaaa tttaatccaa 

61 tcgcgatact gtatatattg ctagtatact tagagttggc tacagatagg caacatctgt 

121 atcctgtaac gtacatgaca aaatattata ttggtatttt aatcattgtg ttgtttgttt 

181 tattattagt aggccgtggg aagcttattt ttgttaataa aaaattatta tatcttgcta 

241 agatattagc tataccaaca attgttcttt tcctgtactc agtcttacta gacgtaatga 

301 acccagttga atttaatgga tattttagta ggttatcaag tacgactatt tttggtttgt 

361 tagctatctt tcaagctata gttgtttttc aattttttgg acaaaaagta gtagattaca 

421 cttttacagc tatctccctc agctacttaa ccagtatcat tgttgccttt aggcagggag 

481 gacttagtca atttatcttg atactaacag atgatagttt caatggttcg gtactagaaa 

541 t 



Figure 62 



1 tttttagaac atactcattt atttaaaagg aaataatagt gaaatttaaa tttaatccaa 

61 tcgcgatact gtatatattg ctagtatact tagagttggc tacagatagg caacatctgt 

121 atcctgtaac gtacatgaca aaatattata ttggtatttt aatcattgtg ttgtttgttt 

181 tattattagt aggccgtggg aagcttattt ttgttaataa aaaattatta tatcttgcta 

241 agatattagc tataccaaca attgttcttt tcctgtactc agtcttacta gacgtaatga 

301 acccagttga atttaatgga tattttagta gattatcaag tacgactatt tttggtttgt 

361 tagctatctt tcaagctata gttgtttttc aattttttgg acaaaaagta gtagattaca 

421 cttttacagc tatctccctc agctacttaa ccagtatcat tgttgccttt aggcagggag 

481 gacttagtca atttatcttg atactaacag atgatagttt caatggttcg gtactagaaa 

541 t 



1 tttttagaac atactcattt atttaaaagg aaataatagt gaaatttaaa tttaatccaa 

61 tcgcgatact gtatatattg ctagtatact tagagttggc tacagatagg caacatctgt 

121 atcctgtaac gtacatgaca aaatattata ttggtatttt aatcattgtg ttgtttgttt 

181 tattattagt aggccgtggg aagcttattt ttgttaataa aaaattatta tatcttgcta 

241 agatattagc tataccaaca attgttcttt tcctgtactc agtcttacta gacgtaatga 

301 acccagttga atttaatgga tattttagta gattatcaag tacgactatt tttggtttgt 

361 tagctatctt tcaagctata gttgtttttc aattttttgg acaaaaagta gtagattaca 

421 cttttacagc tatctccctc agctacttaa ccagtatcat tgttgccttt aggcagggag 

481 gacttagtca atttatcttg atactaacag atgatagttt caatggttcg gtactagaaa 

541 t 



Figure 64 






[Dendrogram; the branch structure is not recoverable from the extracted text. Leaf labels, in the order they appear in the extraction:] 

Mct 4, Mcst 10A-q, Mcst 33F-g, Mct 33A, Mct 37, Mcst 23A-ca, Mct 35B, 
Mcst 17F-35B, Mcst 17F-c, Mct 34, Mct 17A, Mct 35F, Mcst 6B-g, 
Mcst 6A-6B-g, Mct 7C, Mcst 15A-ca1, Mct 9V, Mcst 6B-c, Mct 22F, 
Mcst 15B-22F, Mct 22A, Mct 21, Mcst 33F-q, Mct 33B, Mct 10F, Mct 12F, 
Mcst 2-q, Mct 42, Mct 31, Mcst 5-c, Mcst 11A-nz, Mct 8, Mct 41F, 
Mcst 2-g, Mcst 23F-g, Mcst 10A-23F, Mct 14-g, Mcst 15C-q, Mcst 15B-q, 
Mcst 15C-ca, Mct 29, Mcst 23F-23A, Mcst 23A-23F, Mct 7F, Mcst 14-c, 
Mcst 5-q, Mct 16F, Mcst 15B-c, Mct 20, Mct 13, Mct 9N, Mct 18C, 
Mct 18B, Mct 19F, Mct 18F, Mct 1, Mct 18A, Mcst 23F-c, Mcst 15A-ca2, 
Mcst 6A-ca, Mcst 6A-c, Mcst 6A-g, Mcst 11A-q, Mct 3, Mcst 6B-q, 
Mcst 6A-6B-q, Mcst 11B, Mct 23B, Mct 19A, Mct 38, Mct 25, 
Nontypable-nz, Nontypable-ca 



Figure 65 



Document made available under the 
Patent Cooperation Treaty (PCT) 



International application number: PCT/AU04/000480 
International filing date: 13 April 2004 (13.04.2004) 

Document type: Certified copy of priority document 

Document details: Country/Office: AU 

Number: 2003901717 

Filing date: 10 April 2003 (10.04.2003) 

Date of receipt at the International Bureau: 15 March 2005 (15.03.2005) 

Remark: Priority document submitted or transmitted to the International Bureau in 
compliance with Rule 17.1(a) or (b) 




World Intellectual Property Organization (WIPO) - Geneva, Switzerland 
Organisation Mondiale de la Propriété Intellectuelle (OMPI) - Genève, Suisse 



